XPfree - Brief guide to data safety

But first, a brief note:

Just a note to all beings who use computers to do things.
Do back up your data.
If your data is important to you, don't assume it is safe.
If you get a cryptolocker (ransomware) virus, your data is gone (or at the least, very expensive).
If your hard drive in your computer fails (mean lifetime of current hard drives is 6 years, 25% fail within first 4 years), your data is gone.
If the motherboard or power supply in your computer starts to fail, your data may be gone.
If your OS or RAM does something silly, your data may be gone.
If you get a power spike/lightning strike and it makes it past your power supply, your data is gone.
If you have a blackout/brownout, your data may be gone.
If you accidentally delete your data, your data is gone.
If someone else deletes your data, your data is gone.
If your computer is stolen, your data is gone.
If someone steps on the screen of your laptop and it doesn't have an external display port, you're gonna have to pay someone to retrieve the data or your data is gone.
If your data is gone, your data is gone. When data is gone it doesn't like to come back so much.

The easiest way to make your data not gone is to back it up to an external hard drive (~$100 a terabyte nowadays) using a free 'sync' program like Microsoft's SyncToy or OZsync. A 'sync' program will only copy across new or changed data, meaning you don't have to copy every single document each time you back up.
Another way is to have your documents in the cloud using Dropbox, Google Drive or Cubby (all have free accounts as well as paid). Best solution is to do both, but it depends on how important your data is to you. Be wise, be smart.
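
To make the 'only copy new or changed data' idea concrete, here is a rough Python sketch of what a sync program does at its simplest. It is not how SyncToy or OZsync actually work internally (that is an assumption-free zone: I don't know their internals); the function name and the size/modified-time comparison are purely illustrative choices.

```python
import shutil
from pathlib import Path


def sync_new_or_changed(source: Path, backup: Path) -> list[Path]:
    """Copy files from source to backup only when they are new or changed.

    'Changed' is guessed from file size and modification time, which is a
    common shortcut but an assumption of this sketch, not a guarantee.
    Returns the relative paths that were copied.
    """
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        dest_file = backup / src_file.relative_to(source)
        if dest_file.exists():
            src_stat = src_file.stat()
            dest_stat = dest_file.stat()
            unchanged = (
                src_stat.st_size == dest_stat.st_size
                and src_stat.st_mtime <= dest_stat.st_mtime
            )
            if unchanged:
                continue  # already backed up, skip it
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dest_file)  # copy2 also copies the timestamps
        copied.append(src_file.relative_to(source))
    return sorted(copied)
```

Note that this sketch never deletes anything from the backup, so a file you accidentally delete from the source is still sitting on the external drive.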

To understand data safety from a computer perspective, you need to understand 4 concepts:

  1. Power
  2. Backup
  3. Redundancy
  4. Archiving

Power

Good clean power and protection from power failure/spikes/surges/brownouts is your first level of defence against data loss. With that in mind, you ideally need:

  1. A UPS. A UPS (uninterruptible power supply) stores a certain amount of power in internal batteries and protects against data loss from brownouts and blackouts. The better units also filter power and protect against power surges and spikes.
  2. A zap-catcher or spike protection unit. These are basically designed to protect against the kind of sudden spikes which can easily overwhelm a computer & sometimes a UPS.
  3. A decent PSU. The PSU your computer has may be a good one or a bad one - the questions you need to ask yourself are:
    1. is it from a respectable brand (e.g. Thermaltake, AcBel, Corsair)?
    2. is it in the 'budget' (i.e. inadequate) range of that brand?
    3. can you find reviews of it which test the power output?
    4. is it supplying enough power? PSU Calc Lite is a good way to determine this, but you need to remember that your power supply should be able to supply twice as much power as PSU Calc Lite states in order to cope with power fluctuations and to stay within moderate load ranges. If your PSU is being pushed to its limits 90% of the time, your system will be unstable.
    Generally speaking, the more expensive and higher-wattage the PSU, the better it is likely to cope with your system (but not always!). Google and read some reviews on the power supply you have, or the ones you're looking at getting - check the efficiency at 70% and 80% loads - they should be able to hold up.
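
The 'twice as much as the calculator says' rule of thumb is simple arithmetic, sketched below in Python. The function names and the 300 W example figure are made up for illustration; the factor of two is the rule from the list above, not an industry standard.

```python
def recommended_psu_watts(calculated_watts: float, headroom_factor: float = 2.0) -> float:
    """Apply the rule of thumb: buy a PSU rated at twice the calculated draw."""
    return calculated_watts * headroom_factor


def load_fraction(calculated_watts: float, psu_rated_watts: float) -> float:
    """Rough fraction of the PSU's rating that the system is expected to use."""
    return calculated_watts / psu_rated_watts


# Hypothetical example: a calculator estimates the system draws 300 W.
wanted = recommended_psu_watts(300)   # 600.0 W PSU
load = load_fraction(300, wanted)     # 0.5, i.e. running at half its rating
```

So a system estimated at 300 W would want a PSU of around 600 W, which keeps it running at roughly half load rather than near its limits.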

Backup

Backing up data is a time-delayed replication of data, or should be. If it is instantaneous it is useless. The entire point of backing up data is so that, if, in the moment, you accidentally delete a file or develop system corruption, then the data is present elsewhere. If backing up is instantaneous and not time-delayed, then any changes such as a file deletion or otherwise are reflected in the backup, rendering it useless from a data recovery perspective.

The particular time-delay you choose for backing up data varies. Most businesses back up once a day, but for home users once a week is usually sufficient. That way if you lose a project during the week, last week's copy is still there on another machine. Better backup systems allow multiple time-savepoints, so that changes to data can be easily tracked and fixed if required.
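
Multiple time-savepoints can be as simple as one dated folder per backup run, with the oldest ones pruned. The Python sketch below shows the idea under some assumptions of mine: the folder naming scheme, the `keep` parameter and the function name are all hypothetical, and a real backup tool would avoid copying unchanged files in full each time.

```python
import shutil
from datetime import datetime
from pathlib import Path


def make_snapshot(source: Path, backup_root: Path, now: datetime, keep: int = 4) -> Path:
    """Copy source into a new timestamped folder and keep only the newest `keep`."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    snapshot = backup_root / now.strftime("%Y-%m-%d_%H%M%S")
    shutil.copytree(source, snapshot)  # fails if this savepoint already exists
    # The timestamp format sorts alphabetically in chronological order.
    snapshots = sorted(p for p in backup_root.iterdir() if p.is_dir())
    for old in snapshots[:-keep]:
        shutil.rmtree(old)  # prune the oldest savepoints
    return snapshot
```

Run once a week with `keep=4`, this would give you roughly a month of savepoints to go back through if you only notice a deleted or mangled file some time later.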

It is important to separate the power for your main machines and backup machines. That way, you double your chances of data protection in the event of a surge/spike/blackout/brownout, i.e. if power protection on one setup fails, the other may have survived, but if they are both on the same channel, both may burn. Actually though, the best thing to do is to keep your backup physically disconnected from power when not being used.

Redundancy

Redundancy is not the same thing as backing up data. Redundancy is instantaneous - backing up is time-lapsed. They serve different purposes, but both within the context of data loss. Generally within computers a certain level of redundancy can be achieved by having 2 or more drives in a RAID 1, RAID 5 or RAID 0+1 configuration. This covers against one of the drives dying. However, if you accidentally delete a file, or experience file-system corruption, redundancy does not help you - that's where time-lapsed inter-computer redundancy (ie. backing up) helps.
Redundancy applies equally to active storage (ie. what you're working on now), backed up data, and archival data. For example, you could have a RAID 1 setup on your main machine, and a RAID 1 setup on your backup machine, or in your archives.
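
The difference between redundancy and backup is easier to see in a toy model. The Python sketch below pretends two dictionaries are the two drives of a RAID 1 mirror; the class and its methods are invented for illustration and have nothing to do with how a real RAID controller is implemented.

```python
class MirroredPair:
    """Toy model of a RAID 1 pair: every change hits both drives instantly."""

    def __init__(self):
        self.drives = [{}, {}]  # None marks a dead drive

    def _alive(self):
        return [d for d in self.drives if d is not None]

    def write(self, name, data):
        for drive in self._alive():
            drive[name] = data

    def delete(self, name):
        for drive in self._alive():
            drive.pop(name, None)  # the deletion is mirrored too

    def fail_drive(self, index):
        self.drives[index] = None

    def read(self, name):
        for drive in self._alive():
            if name in drive:
                return drive[name]
        return None

    def copy_for_backup(self):
        alive = self._alive()
        return dict(alive[0]) if alive else {}


pair = MirroredPair()
pair.write("thesis.doc", "chapter 1")
weekly_backup = pair.copy_for_backup()           # time-delayed copy

pair.fail_drive(0)
survived_drive_failure = pair.read("thesis.doc")  # mirror still has it

pair.delete("thesis.doc")                         # oops
survived_deletion = pair.read("thesis.doc")       # gone from the mirror
recovered_from_backup = weekly_backup.get("thesis.doc")
```

The mirror shrugs off the dead drive, but the accidental delete is faithfully applied to the surviving drive as well; only the time-delayed copy still has the file.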

Archiving

Archiving achieves the safe storage of non-active data or projects. It can be regarded as separate from backing up as it's predominantly for non-active as opposed to active data, so the storage methods can be permanent (i.e. write-once, like CD-R/DVD-R) if desired, and once-only as opposed to recurrent. Its forms include optical disc storage (CD/DVD/Blu-ray), tape storage (though less common nowadays) and external computer/hard drive storage. Generally if the data is important, a mixture of two or more forms of archival is good for data safety purposes - for example, archiving on both CD and hard drive, or on two different types of DVD media. This achieves a certain amount of redundancy and also allows for more safety where one particular medium might be faulty or have a shorter life-span (these things are typically difficult to accurately predict in advance).
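
If you do keep two archive copies on different media, one way to spot a faulty copy is to compare checksums of the files on each. This is my own suggestion rather than a required part of archiving, and the Python sketch below (function names, SHA-256 as the checksum, the report layout) is just one hypothetical way of doing it.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Checksum a file in chunks so large archives don't need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_archives(copy_a: Path, copy_b: Path) -> dict[str, list[str]]:
    """Report files that are missing from, or differ between, two archive copies."""

    def manifest(root: Path) -> dict[str, str]:
        return {
            p.relative_to(root).as_posix(): file_sha256(p)
            for p in root.rglob("*")
            if p.is_file()
        }

    a, b = manifest(copy_a), manifest(copy_b)
    return {
        "missing_from_a": sorted(b.keys() - a.keys()),
        "missing_from_b": sorted(a.keys() - b.keys()),
        "different": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }
```

A report with three empty lists means the two copies currently agree; it can't tell you which copy is the bad one when they differ, only that it's time to look.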

Archiving, in digital terms as well as physical, is an ongoing process - DVDs/CDs will always age, particularly if they are used, and hard drives die and/or become obsolete in terms of their interfaces. So long-term intermittent (say every five to ten years) replication of data is a necessity for ongoing archival data safety. For most home users, 10 years is a lifetime and most of your information is going to be redundant by that point, but for some people, keeping songs, images and documents alive for future usage is important.



There you go, hope you enjoyed my little rant. I haven't gone into fire hazards, flood prevention or any of those things because they are way beyond scope, but hopefully I've given you enough information to get you started on some data safety habits and structured thinking.
All advice given without guarantee, as always - use your brain - if anything dies/fries/stops/explodes, see a doctor (but don't talk to me).
M@

