Peter C. (Benchwork)

Because TurnKey is such an awesome idea/system, I want to see if there is a way to take it one step further.
The idea seems simple on the surface, but I know it would be more than a little complex. In a nutshell: when generating the ISO file for a particular appliance, the build would also create some or all of the other container formats inside the same ISO. E.g. the file turnkey-lxc-13.0-wheezy-amd64.iso would also contain all the files for the other supported formats:
          VMDK, OVF, OpenStack, OpenVZ, OpenNode, and Xen.

Now, I know this sounds like an uber-bloat idea, but hold on. Here is the actual idea.
Because it's happening at the time the ISO is generated, we might be able to deduplicate the data.
Please read the info below that I gleaned from different sites (sorry, no sources linked); it gives additional information and concepts that explain it better.

Data deduplication (often called "intelligent compression" or "single-instance storage")
is a method of reducing storage needs by eliminating redundant data. Only one unique instance
of the data is actually retained on storage media, such as disk or tape. Redundant data is
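replaced with a pointer to the unique data copy. For example, a typical email system might
contain 100 instances of the same one megabyte (MB) file attachment. With deduplication, only
one instance is stored, so roughly 100 MB of storage shrinks to about 1 MB.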
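
To get a feel for how much deduplication could save on a set of appliance images, here is a minimal Python sketch. It is not part of any TurnKey tooling; the function names `file_digest` and `dedup_savings` are made up for illustration, and it assumes the images have already been unpacked into an ordinary directory tree. It walks the tree, hashes every regular file, and compares the total size with the size of the unique contents only.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedup_savings(root):
    """Return (total_bytes, unique_bytes) for regular files under root.

    total_bytes counts every file; unique_bytes counts each distinct
    content only once, which is what file-level deduplication would store.
    """
    size_by_digest = {}
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only real file data matters here.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            total += size
            size_by_digest.setdefault(file_digest(path), size)
    unique = sum(size_by_digest.values())
    return total, unique
```

Calling `dedup_savings("/path/to/unpacked/images")` (a hypothetical path) returns two byte counts; the difference between them is the space that file-level deduplication could recover in the best case.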

The optimize parameter in CDImage can compress the contents of the generated ISO file by recording duplicate files only once. This greatly reduces the size of X-in-1 CDs and enables 4 GB of data to fit on one CD.

This functionality is not built into mkisofs (it doesn't analyze the files to check which ones are duplicates), but it can accept hardlinks that point to the files to be included in the ISO as pointers to the actual files (with the parameters '-follow-links -cache-inodes').
That means a file and its hardlink will be encoded only once in the ISO file (giving the same effect as the -o parameter in CDImage).
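
Since mkisofs does not look for duplicates itself, the hardlinks would have to be created before the ISO is built. The following is a rough Python sketch of that preparation step, under some loud assumptions: the function names `sha256_of` and `hardlink_duplicates` are invented for this example, all files live on one filesystem that supports hardlinks, and it is acceptable for duplicate files to end up sharing ownership, permissions and timestamps (hardlinked files share one inode).

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files under root with hardlinks.

    The first file seen with a given content is kept; every later file
    with identical content becomes a hardlink to it.
    Returns the number of files that were replaced.
    """
    first_seen = {}
    replaced = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            original = first_seen.setdefault(sha256_of(path), path)
            if original == path or os.path.samefile(original, path):
                continue  # first copy, or already linked to it
            # Create the link under a temporary name, then swap it into
            # place so the path never disappears halfway through.
            tmp = path + ".dedup-tmp"
            os.link(original, tmp)
            os.replace(tmp, path)
            replaced += 1
    return replaced
```

After running this over a staging directory, the tree would be handed to mkisofs with the parameters quoted above. Whether the extracted container formats would still behave correctly with shared inodes is exactly the open question of this thread.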


So the question is: because each container basically has all the same files, can we build the container files so that they are filled with hardlinks to the files in the ISO, so that when extracted each one is a fully populated container?

The benefits of this are threefold:
1. This could reduce TurnKey's bandwidth requirements (always good).
2. This would be a universal image that can be deployed on any of the environments.
3. Because it's a fully scripted build, this would lower human involvement in the process, giving a more consistent build.


Now that I've explained my thought, I want to know if you think it's possible.
What information do we need?
What roadblocks are there? Etc.

Thanks for taking the time to read this. I look forward to your comments.

Jeremy Davis

I quite like it. Although I'm not sure it's possible or practical.

I imagine that you could probably get something to work for the compressed filesystem images e.g. OpenVZ (I know it best) as they are basically just archived filesystems (which as you point out are also contained on the ISO). But it would lack efficiency as the ISO would then need to be run in live mode to recreate the OVZ archive (OVZ expects a tarball).

The OVF and VMDK images though are complete filesystems including LVM setup. They are essentially the ISO preinstalled into a HDD image rather than a real HDD. Again I guess you could script the ISO to create the VMDK/OVF in live mode but it sort of defeats the purpose of having a VMDK in the first place. You're probably better off just installing from the ISO...

So I like the idea but I think that it adds a fair bit of complexity and when looked at from the perspective of pragmatic execution I'm not sure...

FWIW we have toyed with the idea of distributing DVDs with multiple formats of appliances but never really got to the point of making it happen...

Peter C. (Benchwork)

The idea I had envisioned only involved mounting or opening the ISO file, regardless of platform, and just being able to copy the files as needed. All the complexity would be handled at the time of creation. From there it would just only have the data saved 

I guess the first thing I should do is find out what tools are being used to build each image/container.
I will go digging for them, but it will take some time. Or if someone can tell me what the tools and the process are, I will see what I can come up with from there.


Jeremy Davis

Currently TKLDev makes ISOs by default and BuildTasks makes the other formats...

I think in the future it'd be better to make TKLDev directly output the desired format. Currently the process is a little convoluted...

I am not convinced that you will be able to do it. But I'm more than happy to be proven wrong! :)
