ExtraBit Software Article - Why duplicate files are a Bad Thing

Why duplicate files are a Bad Thing

Mark Williams
September 3rd, 2005

This article explains why duplicate files are a Bad Thing and should be avoided!

  1. Duplicate files waste disk space

    This is the most obvious reason why duplicate files should be avoided. It often seems to be the case that no matter how much hard disk space you have, it always fills up! Often, large amounts of disk space are wasted storing duplicate files.

  2. Duplicate files waste time

    Less obviously, duplicate files can also waste time. This can happen when searching for files. You might think that having more copies of a file would make it easier to find, but the problem comes when the duplicates that turn up are not the file you want, only similar to it. In this case, you can waste time sifting through many more files than necessary.

    Another way that duplicate files can waste time is when you start making changes to them. If your hard disk is disorganized, a search for a particular file may not turn up the same copy each time, so different edits get made in different copies of the file. Eventually you can waste a lot of time merging the changes from several versions of the same file. If the file had never been duplicated in the first place, this extra work would have been avoided.

  3. Duplicate files can lose data

    This may seem counter-intuitive, but duplicate files can end up meaning you lose data. Again, this can happen when your hard disk is disorganized and you have got into the habit of having lots of copies of the same file. The risk is then that you might delete a file thinking that there is another copy when in fact there is not!
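
To get a feel for how much space duplicates are wasting, and to confirm that a second copy really exists before deleting a file, the files in a folder can be grouped by content. The following is a minimal sketch in Python, not part of any ExtraBit product: the function names find_duplicates and wasted_bytes are made up for illustration, and it assumes that two files with the same size and the same SHA-256 digest are identical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return only groups of 2 or more."""
    # First pass: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def wasted_bytes(groups):
    """Space taken by the extra copies in each group (all but one file)."""
    return sum(os.path.getsize(g[0]) * (len(g) - 1) for g in groups)
```

Calling find_duplicates on a folder returns a list of groups, each a list of paths with identical contents. A file that appears in no group has no second copy under that folder, so deleting it would lose data.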

The one time when duplicate files are justified, and are in fact of vital importance, is when they are backups or archives of original files. The key to keeping duplicate files under control in this case is to have a clear distinction between the original files and those that are simply backups.

Probably the simplest way to keep track of which files are just backups and which are originals is to have a folder called Backups and make sure that all backup files are created under this folder.

Alternatively, for some files it may be more useful to have the backups in the same folder as the original file. In this case a naming convention can be used to make it clear which is the original and which is the backup. One good naming convention is to append the date the backup was made to the file name, in the format YYYYMMDD (i.e. year, month as a 2-digit number, day as a 2-digit number). For example, the file name.ext could have a backup called name20050903.ext (for a backup copy of the file made on September 3rd, 2005). The advantage of this naming convention is that if you end up with multiple backups of a file going back through time, the backup files will appear in date order when a normal alphabetically sorted listing of the folder contents is viewed.
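
The naming convention above can be sketched in a few lines of Python. The helper name backup_name is hypothetical, chosen only for this illustration; it places the date stamp between the base name and the extension, as in the name20050903.ext example.

```python
import os
from datetime import date


def backup_name(filename, when=None):
    """Insert a YYYYMMDD date stamp between the base name and the extension."""
    when = when or date.today()
    base, ext = os.path.splitext(filename)
    return f"{base}{when.strftime('%Y%m%d')}{ext}"


print(backup_name("name.ext", date(2005, 9, 3)))  # name20050903.ext

# Because the year comes first and every field has a fixed width,
# a plain alphabetical sort also puts the backups in date order.
stamps = [date(2005, 9, 3), date(2004, 12, 25), date(2005, 1, 10)]
print(sorted(backup_name("name.ext", d) for d in stamps))
```

A format such as DDMMYYYY would not have this property, since sorting alphabetically would then group backups by day of the month rather than by year.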

Of course, your backup strategy shouldn't just rely on having copies of the files on the same disk as the originals, since one of the most likely reasons you will need your backups is when your hard disk fails!

Using SpaceMan 99 to re-organize your files and eliminate duplicates

If your hard disk has become disorganized and contains lots of duplicate files, then SpaceMan 99 provides the tools you need to not only find these files, but to easily and safely remove the duplicate files.

One approach to sorting out a disk that has become disorganized over time is to start by creating a new folder structure that is well organized. Files can then be gradually moved from the disorganized folders to the new folder structure. One of the really powerful facilities provided by SpaceMan 99 to help with this process is the ability to automatically find and delete all duplicate files that are outside a given folder hierarchy. To use this facility, you simply need to use SpaceMan 99 to scan both the old disorganized folders and the new well-structured ones. Select the top folder of the new, well-organized folders, then select the command to Mark Duplicate Files, and then select the option to mark those duplicate files found outside the selected folder. The files found will be those in the disorganized folders that are simply copies of those that have already been moved to the new structure. You can then use the Delete Marked Files command to delete all the files in the disorganized folders that have copies in the new folders. Any files left in the disorganized folders will then be those that have not yet been moved to the new folders. This process can be repeated until no files are left in the disorganized folders.
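
The idea behind marking duplicates outside a chosen folder can be illustrated with a short Python sketch. This is not how SpaceMan 99 is implemented, which this article does not describe. The function names are hypothetical, files are compared by SHA-256 digest as an assumption, and the new and old folder trees are assumed to be separate (neither inside the other).

```python
import hashlib
import os


def content_hash(path):
    """SHA-256 of the file contents (reads the whole file; fine for a sketch)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def walk_files(root):
    """Yield the path of every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def mark_duplicates_outside(keep_root, old_root):
    """Return files under old_root whose contents already exist under keep_root."""
    kept = {content_hash(p) for p in walk_files(keep_root)}
    return sorted(p for p in walk_files(old_root) if content_hash(p) in kept)
```

The function only returns the list of marked files and deletes nothing itself, so the list can be reviewed first. Whatever remains in the old folders after the marked files are removed has not yet been copied to the new structure.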

By using SpaceMan 99 to help with this process you can be sure that you will never accidentally delete any important files. You should also end up with a well-organized folder hierarchy that does not contain any unwanted duplicate files.