For beginners, “de-duplication” simply means NOT storing multiple copies of the same data. Say you have an Excel sheet and you mail it to 10 colleagues. You now have 10 copies of the same data on the mail server, plus the original on your file server. The story doesn’t end here: your file server is backed up every day, as is your mail server. So you back up 11 copies of the Excel sheet.
De-duplication means storing as few copies as possible.
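To make the idea concrete, here is a minimal sketch of a store that keeps identical content only once by keying it on a hash. It is an illustration only: the `DedupStore` class, the file names, and the sample data are all made up for this example, and real products typically work on blocks or chunks rather than whole files.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.blobs = {}   # SHA-256 digest -> content bytes (stored once)
        self.names = {}   # logical file name -> digest of its content

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if this content has not been seen before.
        self.blobs.setdefault(digest, data)
        self.names[name] = digest

    def get(self, name):
        return self.blobs[self.names[name]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


# Hypothetical data: one sheet on the file server, mailed to 10 colleagues.
sheet = b"quarter,revenue\nQ1,100\nQ2,120\n"
store = DedupStore()
store.put("fileserver/report.xls", sheet)
for i in range(10):
    store.put(f"mail/colleague{i}/report.xls", sheet)

logical_bytes = len(sheet) * len(store.names)
print(len(store.names), "logical copies,", len(store.blobs), "stored copy")
print(logical_bytes, "bytes without de-duplication,", store.stored_bytes(), "with")
```

Every one of the 11 names still resolves to the full sheet, but the bytes themselves are held once.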
Some companies have solutions that de-duplicate data as it is backed up, and others have solutions that de-duplicate data after it has been backed up. Now we await something that helps us de-duplicate data at the source itself. It’s not that such technology is unavailable; it’s that we haven’t actually deployed and acted on it. Today my Veritas CommandCentral reports duplicate files across my storage boxes. If I were to spend time de-duplicating them by hand, you can imagine whether I would have any time left for my family. Brocade’s FAN (File Area Network), the Tapestry solution, also comes to mind as some kind of de-duplication technology.
To summarize, I would support de-duplication at the source, simply because it saves more resources. De-duplication at the target would be an easier-to-manage and faster solution.
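The resource argument can be sketched with a rough simulation. The code below is a simplified model under my own assumptions, not how any particular product works: both functions and the sample data are hypothetical, de-duplication is done on whole files, and the source-side client is assumed to send a 32-byte SHA-256 digest for every file before deciding whether to send the content.

```python
import hashlib


def backup_target_side(files):
    """Client sends everything; the target drops duplicates on arrival."""
    stored = {}
    sent = 0
    for data in files:
        sent += len(data)  # every byte crosses the network
        stored.setdefault(hashlib.sha256(data).digest(), data)
    return sent, sum(len(d) for d in stored.values())


def backup_source_side(files):
    """Client hashes first and sends only content the target lacks."""
    stored = {}
    sent = 0
    for data in files:
        digest = hashlib.sha256(data).digest()
        sent += len(digest)  # the digest is always sent (assumed protocol)
        if digest not in stored:
            sent += len(data)  # new content: send the bytes too
            stored[digest] = data
    return sent, sum(len(d) for d in stored.values())


# Hypothetical workload: the same 1000-byte sheet backed up 11 times.
sheet = b"x" * 1000
files = [sheet] * 11

target_sent, target_stored = backup_target_side(files)
source_sent, source_stored = backup_source_side(files)
print("target-side: sent", target_sent, "bytes, stored", target_stored)
print("source-side: sent", source_sent, "bytes, stored", source_stored)
```

In this model both approaches end up storing the same amount, but the source-side version moves far less data over the network, at the cost of doing the hashing on the client. The target-side version leaves the clients untouched, which is where the ease of management comes from.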