Size vs. Size on Disk: Understanding the Difference

Understanding Size vs. Size on Disk
Typically, the reported 'Size' and 'Size on disk' values for a file or folder are nearly identical. However, significant differences can sometimes occur, leading to confusion.
This SuperUser Q&A post delves into the reasons behind these discrepancies and clarifies the underlying mechanisms.
What Causes the Difference?
The 'Size' represents the actual amount of data contained within the file. Conversely, 'Size on disk' reflects the amount of storage space the file occupies on the storage medium.
Several factors contribute to the difference between these two values. These include the file system's allocation unit size, file compression, and sparse files.
Allocation Unit Size
File systems divide storage space into allocation units (also known as clusters). Even if a file doesn't completely fill an allocation unit, the entire unit is reserved for it.
Consequently, the 'Size on disk' will always be a multiple of the allocation unit size, potentially exceeding the actual file 'Size'.
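The rounding described above can be sketched in a few lines of Python. This is only an illustration: the function name size_on_disk and the 4,096-byte allocation unit are assumptions chosen for the example, not values taken from any particular file system.

```python
def size_on_disk(size_bytes: int, cluster_bytes: int) -> int:
    """Round a file size up to a whole number of allocation units."""
    if size_bytes < 0 or cluster_bytes <= 0:
        raise ValueError("size must be >= 0 and cluster size must be > 0")
    clusters = -(-size_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes


# Assumed 4,096-byte allocation unit, for illustration only.
print(size_on_disk(1, 4096))     # 4096
print(size_on_disk(4096, 4096))  # 4096
print(size_on_disk(4097, 4096))  # 8192
```

A one-byte file and a file that exactly fills the unit both occupy one unit; a single extra byte spills into a second one.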
File Compression
If the file system itself compresses a file (as NTFS compression does, for example), its 'Size' will be larger than its 'Size on disk' because the compressed version takes up less space.
The degree of compression impacts the magnitude of this difference.
Sparse Files
Sparse files contain large sections of zero-filled data that aren't physically stored on the disk. Instead, the file system records that these sections are filled with zeros.
This technique significantly reduces the 'Size on disk' while the logical 'Size' remains substantial.
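A rough way to see this effect is to create a file with a large hole and compare its logical size with the space actually allocated. The sketch below is an assumption-laden illustration: the helper names are made up for this example, st_blocks is only available on Unix-like systems, and whether a hole is really left unallocated depends on the file system (FAT32, for instance, does not support sparse files).

```python
import os
import tempfile


def make_sparse_file(path, logical_size):
    """Create a file of the given logical size without writing any data."""
    with open(path, "wb") as f:
        # Extending a file with truncate() leaves a hole on file systems
        # that support sparse files; others allocate the full range.
        f.truncate(logical_size)


def logical_and_allocated(path):
    """Return (logical size, allocated bytes or None if unavailable)."""
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # not provided on Windows
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 10 * 1024 * 1024)
        size, on_disk = logical_and_allocated(path)
        print("Size:        ", size)
        print("Size on disk:", on_disk)
```

On a file system with sparse-file support, the second number is typically far smaller than the first.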
Where to Find More Information
The original discussion on SuperUser provides further insights and detailed explanations of the factors influencing file size reporting.
Understanding the Discrepancy Between 'Size' and 'Size on Disk'
A SuperUser user, thelastblack, has inquired about a significant difference observed between the 'Size' and 'Size on disk' values for a folder located on their Android phone’s SD card.
The User's Observation
The user highlighted a substantial disparity between these two metrics, as illustrated in a provided screenshot.

They acknowledge that 'Size on disk' typically exceeds 'Size' due to the allocation unit structure in operating systems like Windows.
Possible Causes for the Large Difference
However, the user questions the magnitude of the difference observed, speculating whether the large number of files within the folder could be a contributing factor.
The folder in question contains cached map data utilized by a maps application, which retrieves maps from Google Maps.
Analyzing the Situation
The screenshot clearly demonstrates a considerable difference between the reported 'Size' and 'Size on disk' values.
Several factors can contribute to this discrepancy, and it's important to understand how file systems manage storage.
Factors Influencing 'Size on Disk'
- Allocation Units: File systems divide storage into fixed-size blocks called allocation units. Even if a file doesn't completely fill a block, the entire block is allocated to it.
- File System Overhead: Metadata associated with each file, such as file name, permissions, and timestamps, consumes disk space.
- Sparse Files: Some files may contain large sections of zero-filled data that aren't physically stored on the disk, reducing the actual space used.
- Caching: The maps application's caching mechanism might involve creating temporary files or utilizing storage in a way that affects the 'Size on disk' calculation.
The combination of these factors, particularly the allocation unit size and the large number of small map tiles, likely explains the significant difference observed by the user.
The Android file system, while different from Windows, still employs similar principles of allocation and overhead.
Therefore, for a folder of many small, uncompressed files like this one, the 'Size on disk' will be larger than the combined logical 'Size' of the files it contains.
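To estimate how much of a folder's 'Size on disk' comes from allocation units alone, one could walk the folder and round each file up to a whole unit. The following is a minimal sketch under stated assumptions: folder_sizes is a hypothetical helper, the allocation unit size must be supplied by the caller, and compression, sparse files, and directory metadata are ignored.

```python
import os


def folder_sizes(root, cluster_bytes):
    """Return (total logical size, estimated size on disk) for a folder."""
    total_size = 0
    total_on_disk = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            total_size += size
            # Round each file up to a whole number of allocation units.
            total_on_disk += -(-size // cluster_bytes) * cluster_bytes
    return total_size, total_on_disk


if __name__ == "__main__":
    # Assumed 32 KiB allocation unit; adjust to match the actual volume.
    size, on_disk = folder_sizes(".", 32 * 1024)
    print(f"Size: {size:,} bytes, estimated size on disk: {on_disk:,} bytes")
```

The more small files a folder holds, the further these two totals drift apart.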
Understanding SD Card Space Usage
A SuperUser community member, known as Bob, provides insight into the issue of wasted space on SD cards.
The following explanation assumes the use of the FAT/FAT32 file system, common for SD cards. While NTFS and exFAT share similar allocation unit behaviors, other file systems may differ and are generally unsupported by Windows.
The scenario of numerous small files contributing to significant space consumption is entirely plausible. Consider the following:
- A quantity of 50,000 individual files.
- A 32 KB cluster size, representing the maximum allocation unit size for FAT32.
Consequently, if every file holds at least some data, the minimum disk space required is 50,000 × 32,000 bytes = 1.6 GB (using SI prefixes for simplicity). The space occupied by each file is always a multiple of the allocation unit size, so part of its last unit usually goes unused.
For instance, if each file averages 2 KB in size, the total data would occupy approximately 100 MB. However, each file still takes up a full 32 KB unit, so an average of 30 KB per file is wasted, or about 1.5 GB in total.
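The arithmetic can be checked with a short script. The figures are the ones used in the example above (50,000 files, 32 KB units, 2 KB average file size, SI prefixes); the variable names are only illustrative.

```python
FILE_COUNT = 50_000
CLUSTER_BYTES = 32_000   # 32 KB, using SI prefixes as in the text
AVG_FILE_BYTES = 2_000   # 2 KB average file size

data_bytes = FILE_COUNT * AVG_FILE_BYTES   # what 'Size' reports
disk_bytes = FILE_COUNT * CLUSTER_BYTES    # what 'Size on disk' reports
wasted_bytes = disk_bytes - data_bytes
wasted_per_file = wasted_bytes // FILE_COUNT

print(f"Size:            {data_bytes / 1e6:.0f} MB")      # 100 MB
print(f"Size on disk:    {disk_bytes / 1e9:.1f} GB")      # 1.6 GB
print(f"Wasted:          {wasted_bytes / 1e9:.1f} GB")    # 1.5 GB
print(f"Wasted per file: {wasted_per_file / 1e3:.0f} KB")  # 30 KB
```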
Detailed Explanation
The FAT32 file system keeps track of where each file is stored, but it does not do so byte by byte. A byte-level list would make the tracking table grow rapidly and consume excessive space. Instead, the system employs "allocation units," also termed "clusters." The disk volume is divided into these units, which are the smallest blocks the file system can address. This is analogous to a house number; the post office doesn't need to know the number of bedrooms or occupants.
When a small file is stored, the file system allocates an entire allocation unit to it, regardless of the file's actual size. Whether the file is 1 KB, 2 KB, or 15 KB, it will be assigned a 32 KB unit in this example. The unused portion of the unit remains allocated to the file, representing wasted space, similar to an unoccupied bedroom.
Why do allocation unit sizes vary? It's a trade-off between a larger tracking table and increased wasted space. Larger units are more efficient for large files, as a new unit isn't allocated until the current one is full. However, with many small files, a large table is inevitable, making smaller units more sensible.
Generally, larger allocation units lead to significant space wastage when dealing with numerous small files. For typical use cases, exceeding 4 KB is usually unnecessary.
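The trade-off can be made concrete with a rough calculation. The sketch below is a simplification under stated assumptions: a hypothetical 32 GiB volume, 50,000 files of 2,048 bytes each, a single copy of the allocation table, and 4 bytes per FAT32 table entry. The function name tradeoff is made up for this example.

```python
FAT32_ENTRY_BYTES = 4  # each FAT32 table entry is 32 bits wide


def tradeoff(volume_bytes, file_count, avg_file_bytes, cluster_bytes):
    """Return (approximate table size, wasted space) for one cluster size."""
    # One table entry per allocation unit on the volume.
    table_bytes = (volume_bytes // cluster_bytes) * FAT32_ENTRY_BYTES
    # Each file is rounded up to a whole number of allocation units.
    clusters_per_file = -(-avg_file_bytes // cluster_bytes)
    wasted_bytes = file_count * (clusters_per_file * cluster_bytes - avg_file_bytes)
    return table_bytes, wasted_bytes


VOLUME_BYTES = 32 * 1024**3  # assumed 32 GiB card, for illustration only

for kib in (4, 8, 16, 32):
    table, wasted = tradeoff(VOLUME_BYTES, 50_000, 2048, kib * 1024)
    print(f"{kib:>2} KiB units: table ~{table:,} bytes, wasted ~{wasted:,} bytes")
```

Under these assumptions, moving from 32 KiB to 4 KiB units costs a few tens of megabytes of extra table space but recovers well over a gigabyte of wasted space.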
Does Fragmentation Play a Role?
Fragmentation itself doesn't directly cause this type of space wastage. While large files may be fragmented (split across non-contiguous allocation units), each unit should be fully utilized before a new one is assigned, so at most the last unit of a file is partially empty. Defragmentation rearranges where a file's units sit on the disk, but it won't address the core issue of wasted space within partially filled units.
Potential Resolutions
The available options are either accepting the current situation or reformatting the card with smaller allocation units.
The card may be formatted with FAT16, which has limitations on table size and necessitates larger allocation units for larger volumes (up to 2 GB with 32 KB units). According to Braiam, reformatting to FAT32 should be possible and resolve this issue.
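The FAT16 limitation can be approximated as follows. This is a simplified sketch, not a description of any formatting tool: it assumes a FAT16 volume can hold at most 65,524 clusters, that units range from 512 bytes to 32 KiB in powers of two, and it ignores the space taken by reserved sectors and the tables themselves. The function name is hypothetical.

```python
FAT16_MAX_CLUSTERS = 65_524  # assumed upper bound on clusters in a FAT16 volume


def smallest_fat16_cluster(volume_bytes):
    """Return the smallest unit size that fits the volume, or None if too big."""
    cluster = 512
    while cluster <= 32 * 1024:
        if volume_bytes // cluster <= FAT16_MAX_CLUSTERS:
            return cluster
        cluster *= 2
    return None  # volume too large for FAT16 with 32 KiB units


print(smallest_fat16_cluster(1_000_000_000))  # 16384
print(smallest_fat16_cluster(2_000_000_000))  # 32768
print(smallest_fat16_cluster(4_000_000_000))  # None
```

In other words, a 2 GB card formatted as FAT16 has no choice but to use the largest units, which is exactly the situation where many small files waste the most space.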