OLPC Image statistics

During construction of the operating system 'firmware' images for the OLPC platform, a number of reports are generated. These identify and guide us to the areas of the distribution that consume an excessive amount of disk space.

RPM size statistics

Running the 'rpm -qi [rpmname]' command shows information about an installed RPM package, including the total size of its deployed files. During construction of the OS images, however, many files (man pages, documentation, irrelevant hardware drivers) are stripped out of the image. In addition, the JFFS2 filesystem applies block compression to all files. So the 'rpm -qi' output is not a directly useful indication of a package's size in the output images. Thus, the mkfs.jffs2 command was extended to dump out the compressed size of each file written to the JFFS2 image. A little post-processing of this information, correlated with the RPM file listings, enabled generation of a report listing the exact space consumed in the JFFS2 image by each RPM. The top 5 entries in the data are:

Initial Size  Stripped Size  JFFS2 size  Package                         %saving w/ stripping  %saving w/ compression
34206284      34206284       15828661    thunderbird-1.5-0.5.5.rc1.i386  0.00%                 53.73%
27996501      27995381       12935670    firefox-1.5-4.i386              0.00%                 53.80%
21446880      21378213       9535347     python-2.4.2-2.1.i386           0.32%                 55.54%
32033610      26274333       8851617     gnome-applets-2.13.1-4.i386     17.98%                72.37%
29355147      21245769       8719986     perl-5.8.7-8.1.i386             27.63%                70.29%
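
The post-processing step described above can be sketched in a few lines of Python. This is only an illustration, not the actual reporting script: the two input mappings (per-file compressed sizes taken from the mkfs.jffs2 dump, and per-package file lists such as those printed by 'rpm -ql') and the function names are assumptions made for the example, and the format of the mkfs.jffs2 dump itself is not reproduced here. The figures in the table are consistent with both savings columns being measured against the initial size, so the helper does the same.

```python
from collections import defaultdict


def package_sizes(jffs2_sizes, rpm_files):
    """Sum compressed file sizes per package.

    jffs2_sizes: dict of file path -> compressed bytes in the image (hypothetical input)
    rpm_files:   dict of package name -> iterable of file paths it owns (hypothetical input)
    """
    owner = {}
    for package, paths in rpm_files.items():
        for path in paths:
            # If two packages list the same path, the first one seen keeps it.
            owner.setdefault(path, package)

    totals = defaultdict(int)
    for path, size in jffs2_sizes.items():
        # Files that no package claims are grouped under a placeholder name.
        totals[owner.get(path, "(unowned)")] += size
    return dict(totals)


def saving_percent(initial, reduced):
    """Percentage saved relative to the initial size, rounded to two places."""
    return round(100.0 * (initial - reduced) / initial, 2)


if __name__ == "__main__":
    # gnome-applets row from the table: initial, stripped and JFFS2 sizes.
    print(saving_percent(32033610, 26274333))  # saving from stripping
    print(saving_percent(32033610, 8851617))   # saving after compression
```

Sorting the result of package_sizes by value, largest first, would give a ranking of the same kind as the table.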

Where, you might ask, is glibc-common, which I can see taking up 65 MB on my desktop? The answer is in the complete report. While glibc-common might normally take up 65 MB, once it is stripped down to what is required for this particular image build, it is in 15th place, with a mere 4.1 MB. This clearly illustrates the importance of not making hasty judgements based on information from a normal desktop install of Fedora Core. The file stripping and JFFS2 compression alter the disk footprint of some packages much more dramatically than others. With this report showing actual JFFS2 usage, we can concentrate on the areas which will have the most impact on our target scenario.

Directory disk usage

Following on from the theme above, another way of looking at disk usage is to consider the problem on a per-directory basis, rather than per-package. The traditional tools for doing this are du, for trawling the disk to report on usage, and xdu, to visualize it. Here again, though, we have to take care not to be misled. The results of running the normal 'du' command on a tree backed by an ext3 filesystem are considerably different from those obtained by considering the actual JFFS2 block usage. Thus another tool was written to analyse the output of the mkfs.jffs2 command and generate a report showing per-directory usage in a format compatible with the normal du command. For example, it is much more important to trim directories containing binary files, such as images, than directories containing text files, since the latter compress very well. Compare a report on disk usage in ext3:

[Image: directory disk usage on ext3]

With the same files written to JFFS2:

[Image: directory disk usage on JFFS2]

If you have xdu installed, you can view the original datafiles for JFFS2 and ext3 directly.
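
The per-directory roll-up described in this section can be sketched as follows. This is a minimal illustration rather than the tool itself: it assumes the per-file compressed sizes have already been extracted from the mkfs.jffs2 output into a mapping of path to bytes, and the function names are invented for the example. The output imitates the size, tab, path lines that du prints, with sizes in 1 KiB units as GNU du uses by default; whether the original tool used the same units is an assumption.

```python
import posixpath
from collections import defaultdict


def directory_usage(file_sizes):
    """Roll per-file byte counts up into cumulative per-directory totals.

    file_sizes: dict of absolute file path -> bytes used in the image (hypothetical input)
    """
    totals = defaultdict(int)
    for path, size in file_sizes.items():
        directory = posixpath.dirname(path)
        while True:
            # Charge the file to its directory and to every ancestor.
            totals[directory] += size
            if directory in ("/", ""):
                break
            directory = posixpath.dirname(directory)
    return dict(totals)


def format_du(totals):
    """Render totals as du-style lines: size in 1 KiB units, a tab, the path."""
    lines = []
    # Reverse lexicographic order lists every subdirectory before its parent.
    for directory in sorted(totals, reverse=True):
        blocks = -(-totals[directory] // 1024)  # round up to whole KiB
        lines.append(f"{blocks}\t{directory}")
    return "\n".join(lines)


if __name__ == "__main__":
    example = {"/usr/bin/a": 2048, "/usr/lib/b": 1000, "/etc/c": 1}
    print(format_du(directory_usage(example)))
```

Feeding such a function the JFFS2 sizes rather than the on-disk ext3 sizes is what makes a text-heavy directory shrink relative to a directory of already compressed images.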


Last updated on Monday, Feb 6, 2006