alexsunny123

Hello,

 

Because TurnKey is such an awesome idea/system, I want to see if there is a way to take it one step further.
The idea seems simple on the surface, but I know it would be more than a little complex. In a nutshell: when generating the ISO file for a particular appliance, the build would also create some or all of the other containers inside the same ISO. E.g. the file turnkey-lxc-13.0-wheezy-amd64.iso would also contain all the files for the supported formats:
          VMDK, OVF, OpenStack, OpenVZ, OpenNode, and Xen.

Now, I know this sounds like an uber-bloated idea, but hold on. Here is the actual idea:
because it happens at the time the ISO is generated, we might be able to deduplicate the data.
Please read the info below that I gleaned from different sites (sorry, no sources linked). It gives additional information and concepts that explain it better.

*******************************************************
Data deduplication (often called "intelligent compression" or "single-instance storage")
is a method of reducing storage needs by eliminating redundant data. Only one unique instance
of the data is actually retained on storage media, such as disk or tape. Redundant data is
replaced with a pointer to the unique data copy. For example, a typical email system might
contain 100 instances of the same one megabyte (MB) file attachment.
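
To make the email example concrete, here is a minimal Python sketch of single-instance storage. It is not taken from any of the quoted sites; the class name SingleInstanceStore and the mailbox paths are made up for illustration, and a real deduplicating system would work on disk rather than in memory.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: each unique payload is kept once."""

    def __init__(self):
        self.blobs = {}     # digest -> the single retained copy of the data
        self.pointers = {}  # name -> digest (the "pointer" to the unique copy)

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        # Keep the payload only if this content has not been seen before.
        self.blobs.setdefault(digest, data)
        self.pointers[name] = digest

    def get(self, name):
        return self.blobs[self.pointers[name]]

    def stored_bytes(self):
        # Bytes actually retained on the "storage media".
        return sum(len(blob) for blob in self.blobs.values())

    def logical_bytes(self):
        # Bytes that would be needed if every name kept its own copy.
        return sum(len(self.blobs[d]) for d in self.pointers.values())


# The example from the quote: 100 instances of the same 1 MB attachment.
attachment = b"x" * (1024 * 1024)
store = SingleInstanceStore()
for i in range(100):
    store.put(f"mailbox-{i}/attachment.bin", attachment)
```

With the 100 identical attachments stored, logical_bytes() reports 100 MB while stored_bytes() reports 1 MB, since only one copy is kept and the other names are just pointers to it.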

*******************************************************
The optimize parameter in CDImage can shrink the generated ISO file by recording duplicate files only once. This greatly reduces the size of X-in-1 CDs and enables 4 GB of data to fit on one CD.

This functionality is not built into mkisofs (it doesn't analyze the files to check which ones are duplicates), but it can accept hardlinks that point to the files to be included in the ISO file as pointers to the actual files (with the parameters '-follow-links -cache-inodes').
That means that a file and its hardlink will be encoded only once in the ISO file (giving the same effect as the -o parameter in CDImage).
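
The missing analysis step could be done before the ISO is generated. Below is a rough Python sketch, assuming the files destined for the ISO are laid out in a staging directory on a filesystem that supports hardlinks. The function names (file_digest, hardlink_duplicates) and the demo directory names are hypothetical, and this is not part of mkisofs or of TurnKey's build scripts.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files under root with hardlinks.

    Returns the number of files that were replaced by a hardlink.
    Caveat: hardlinked names share one inode, so they also share
    ownership, permissions and timestamps.
    """
    first_seen = {}  # (size, digest) -> path of the first file with that content
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            key = (os.path.getsize(path), file_digest(path))
            original = first_seen.get(key)
            if original is None:
                first_seen[key] = path
            elif not os.path.samefile(original, path):
                # Link under a temporary name first, then swap it into place.
                tmp = path + ".dedup-tmp"
                os.link(original, tmp)
                os.replace(tmp, path)
                replaced += 1
    return replaced


# Demo: two illustrative trees that contain one identical file each.
demo_root = tempfile.mkdtemp()
for sub in ("openvz", "xen"):
    os.makedirs(os.path.join(demo_root, sub))
    with open(os.path.join(demo_root, sub, "libexample.so"), "wb") as f:
        f.write(b"same bytes in both trees")
replaced_count = hardlink_duplicates(demo_root)
```

After this runs, the two demo files share one inode, which is the situation the '-cache-inodes' parameter mentioned above is meant to detect. It only suits files whose metadata is meant to be identical as well as their contents.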

*******************************************************

So the question is: because each container basically has all the same files, can we find a way to create them so that the container files are filled with hardlinks to the files in the ISO, and each one is a fully populated container when extracted?


The benefits of this are threefold:
1. It could reduce TurnKey's bandwidth requirements (always good).
2. It would be a universal image that can be deployed on any of the environments.
3. Because it's an all-scripted build, it would lower human involvement in the process and give a more consistent build.

 

Now that I've explained my thought, I want to know if you think it's possible.
What information do we need?
What roadblocks are there? Etc.

Thanks for your time reading this. I look forward to your comments.
Peter.

 

thanks

alexsunny

Jeremy Davis

I'm not sure if this will work?! As you've possibly already realised, the ISO is designed to run either live or as an installer. As such, the root filesystem is stored on the ISO as a squashfs filesystem. The live functionality mounts the squashfs to RAM and runs from that. The install functionality mounts it to RAM and then copies the files to the (partitioned) disk.

Even if what you are suggesting can be achieved, I'm not clear how it would work and how user friendly it would be. E.g. how would someone who wanted to use the OVA be able to extract that from the ISO?

FWIW all the other builds are created from the ISO already. We unpack the squashfs and convert those same files to the other builds. We call the scripts that convert the ISO to other builds buildtasks. If you wanted to reduce your own bandwidth and build for different local targets, you could use them locally?!
