Today, it’s commonplace to see news reports of security breaches. Safeguarding your company’s data is of utmost importance. Yet building and enforcing protections can be a challenge when there are so many employees who require privileged access to your firm’s most critical data.
This practical step-by-step guide from BeyondTrust will help you quickly make progress in protecting against data breaches. It discusses:
- Controlling user privileges within your environment
- Reducing your attack surface
- Building a privileged access management strategy
It’s well recognised that the internet of the future is going to have to cope with content that is much more bandwidth-intensive, and some cloud and co-location services may not be able to keep up.
Edge Computing might be a solution to help companies prepare now to avoid network congestion and poor performance issues in the future. We recommend this research paper from Tech Research Asia, which covers:
- An explanation of what Edge Computing is and is not
- The technical reasons and business drivers for adoption
- Examples of possible applications in the future
- A checklist of strategy questions to help prepare your organisation
Digital Transformation is an important agenda item, yet few organisations make it a top priority. A white paper from Tech Research Asia and Hitachi will help you move from just thinking about it to actually implementing it.
The paper will guide you through a Digital Readiness Health Check and explain how to take digital transformation to the next level by:
- Deepening your digital readiness
- Overcoming internal culture that prevents digital projects
- Identifying great digital successes in your industry and applying those lessons to your processes
- Implementing digital projects that are genuinely transformational
Your privileged accounts are a hacker’s favourite target – 62% of breaches result from a lack of privileged account management
Privileged account passwords for domain admin accounts, root accounts, superuser accounts, and more, are the preferred targets for hackers these days. Why? Because they give them the “keys to the kingdom,” allowing them to gain access to your most sensitive and critical information resulting in data loss, identity theft and more.
With the daily workloads many IT groups face, it is easy to overlook the effective management of these key accounts. People leave the organisation or contractors finish their work and, despite the very best intentions, their credentials remain active, making it easy for someone to gain unauthorised access. Similarly, potential security risks can arise within the IT group itself, where a number of staff members need access to particularly secure parts of the network. Despite the risks involved in distributing these logons, it is not uncommon for unqualified staff to gain access to them.
In many respects, leaving these privileged accounts unmanaged is like having the world’s most secure bank vault but leaving the keys on the table!
So what can be done to protect against privileged account attacks?
There are a number of steps to bring privileged accounts under control, both as internal actions and by using software to provide an ongoing management platform. Some simple and quick steps can make a significant improvement to security:
Discover and Secure
Deploy a network tool that can identify all active accounts, the access history relating to them and their relevant access levels. Once identified, each account can be validated and either deleted if no longer required or brought into a centralised and secure privileged account management application.
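As an illustrative sketch (the account data, field names and 90-day threshold below are hypothetical; real discovery tools query Active Directory or LDAP directly), stale privileged accounts could be flagged from an exported account report like this:

```python
from datetime import datetime, timedelta

# Hypothetical inputs purely for illustration; a real tool would pull
# account and last-login data directly from the directory service.
STALE_AFTER = timedelta(days=90)
NOW = datetime(2018, 6, 1)

accounts = [
    {"name": "svc_backup",  "privileged": True,  "last_login": datetime(2018, 5, 30)},
    {"name": "contractor1", "privileged": True,  "last_login": datetime(2017, 11, 2)},
    {"name": "jsmith",      "privileged": False, "last_login": datetime(2018, 2, 1)},
]

# Flag privileged accounts with no recent login for validation:
# delete if no longer required, or bring into the secure vault.
stale = [a["name"] for a in accounts
         if a["privileged"] and NOW - a["last_login"] > STALE_AFTER]
print(stale)  # ['contractor1']
```

Once validated, each flagged account can then be deleted or brought under the centralised management application, as described above.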
Enforcing Least Privilege and Application Whitelisting
Removing administrator or superuser privileges from users safeguards employees from malicious software. Application whitelisting lets organisations analyse software before making it available, run it with the minimum privileges needed for specific tasks, check whether an application comes from a trusted source, enhance system security controls, and alert security analysts to suspect requests.
Protecting Password and Privileged Account Access
Implementing effective security controls over these powerful accounts can mean the difference between containing a simple perimeter breach and experiencing a cyber catastrophe. Companies should routinely train employees on best practices for password choice. Insecure password habits often arise when a very complex, hard-to-remember password is required. Storing passwords in a secure vault and using automated password management software can mean the difference between a single compromised system or user account and the compromise of the organisation's entire computing environment. Organisations need to continuously audit and discover user accounts and applications that provide privileged access, and remove administrator rights where they are not necessary.
Keeping Systems Patched and Up-To-Date
Another key security control focuses on continuous security patching of applications and operating systems. Keeping all application and operating system security updates current will significantly reduce the risk from outside attackers and other malicious intrusions. Minimising privileged credential risk, limiting user privileges, and controlling applications on endpoints and servers will significantly reduce the chance of company systems and data being exploited.
Ongoing Security and Management
To provide an effective platform to manage these privileged accounts, Perfekt partners with Thycotic, a global leader in IT security that provides protection against cyber and internal attacks. Thycotic's award-winning privileged account management solution, "Secret Server", minimises privileged credential risk, limits user privileges and controls applications on endpoints and servers.
Secret Server provides a number of key features for effective account access management, both on-premises and in cloud environments:
- A secure vault and password manager with Active Directory integration
- Automatic discovery of local and Active Directory privileged accounts
- Automatic password changing for network accounts
- Enhanced auditing and reporting
- CRM, SAML, HSM integration
- Monitoring of keystrokes and activity relating to privileged accounts
Perfekt can deliver cost effective and efficient solutions around the management of these highly privileged accounts. If you are unsure how effectively your privileged accounts are being managed and would like to discuss how Perfekt can possibly help, please give us a call.
Commvault has iterated and significantly streamlined its licensing model, which we know will delight customers with much richer bundled functionality. Called Commvault Complete, the new model gives existing clients a range of built-in features that were previously charged separately. For new clients, Commvault becomes more affordable, with access to the full feature set at a lower price than before.
Most Commvault clients will have experienced multiple licensing offerings over the 20+ years the company has been in operation. Previous license tweaks changed the pricing models, and most clients struggled to understand what was needed to contain the costs of licensing the platform.
Following an extensive review, Commvault’s new Complete licensing has been simplified into just 10 part numbers (down from over 300) and is available as perpetual or subscription license structures.
As part of the next annual support renewal, an "automatic" conversion process will entitle clients who have both TB capacity licensing (eg DPA/DPF) and VM/socket licensing to a mix of both. However, at the next upgrade purchase, Commvault will consolidate licenses into one or the other (and if VM-based, supplement with Operational Instances as applicable for protection of NAS systems or physical servers).
Migration to the new license model is completely free (there is nothing more to pay), with no loss of functionality and with the following additional functionality included at no charge:
For clients with VM/socket licensing, full application agents are now bundled with all licensing. This means databases and applications can be protected using Commvault's advanced agents without additional TB-based licensing.
Move from data protection to "data management", broadening beyond just backup. Great for project content (eg engineering) with dormant files or files accessed infrequently. (Mailbox archive is licensed separately.) If licensed by TB after conversion, backup TBs can be consumed for either backup or archive. If licensed by VM, there is no impact on usage – archive is effectively free.
Allows a private (on premises) metrics reporting server (rather than in the Commvault cloud) and access to a broader range of Commvault reports.
VM LIVE SYNC
Protects larger VMs (>2TB) that often struggle to be snapshotted, and also offers DR-style replication similar to vSphere Replication, Veeam, Zerto, etc.
For sites with a heavy reliance on tape, provides sophisticated enterprise tape management and tracking beyond the built-in VaultTracker functionality.
Perfekt will help all existing clients take advantage of the new license model by walking each site through this process. In the first instance, send us your current License Summary Report. For clients not using Commvault today, now is the time to consider a Commvault solution from Perfekt. Commvault has been a Gartner Magic Quadrant Leader in backup and data management for seven years, and its new licensing makes great sense following these changes.
How long does a bridge or building last?
Been to Europe lately? It could be over 500 years. The trouble is that in Australia we don't often have that long-term thinking. Certain regulations ask us to retain records for:
- The life of the patient (or until the child turns 18 plus 7 years = age 25)
- The length of employment, plus 7 years
- The term of the contract
- At least 7 years for financial records
- After an OH&S incident, plus 5 years
- 70 years after the end of the year of a copyright creator’s death
- And so on
- And some regulations are specified as “at least…”, with no maximum specified.
What could this look like?
- Every day you add new records to the database
- You may update some records (depending on how the application handles edits)
- You don't purge old records to save space, because disk space is relatively cheap, and the application updates existing records to show an account or transaction as closed or complete
For protection purposes, do you:
- Back up the database daily and keep each backup for 7 years? Or
- Keep the database with historical records online, and apply data protection techniques to meet the regulatory requirements? This could take the form of:
  - an occasional backup for recovery purposes
  - retaining this backup until you take the next backup, with a small number of levels of recoverability
Levels Of Recoverability
The ongoing mistake is over-catering for multiple "levels of recoverability" from copies made years ago. If a file such as a database is updated every day for 7 years, there would be very few circumstances in which you would need to go back to a copy from specifically 2374 days ago (6½ years, in case you were wondering). Most organisations keep only a monthly "archive" copy of the backup for just-in-case recovery. Very few organisations would ever recover an application back this far; on the rare occasion, they are more likely to stand up a parallel copy and examine aged records. However, if the online database retains records all the way back to 7 years ago, then this typically meets the record retention requirements without any backup copies. What does this mean? Unless there is some hefty purging of content (by users or inside the application), there is no need to retain backups for such a long period.
What about files in a file server, where there are ongoing updates and even deletions of files? Management of whole files is actually more challenging than for records in a database. Enterprise backup software applications have an archive module for exactly this functionality. If you intend to purge entire files from the active primary data store – for example, to relieve pressure on backup times and on primary storage usage – the best option is to archive this data, usually to a much lower-cost tier such as nearline disk or tape, or possibly to the cloud. For compliance purposes, to retain a copy of a file, some type of protection scheme must be in place. This could be a backup, archive, records/document management system, or repository.
Data Classification and Retention
A 2012 study across a broad range of industries showed that:
So what would be a simple and effective backup retention regime?
Assuming the deep purging of in-database records does not occur, your life can be quite simple. Here's an outline of sound practice:
**"Current" data set**

| Aspect | Recommendation |
| --- | --- |
| How long to keep | Determined by restore profile. If not sure, 1 month is a good benchmark |
| Databases – frequency of copies | Could be backed up hourly (or in extreme cases every 10-15 mins). High-frequency database backups would have a single backup nominated as the retained representative daily backup; other intermediate copies are expired within a small number of hours |
| Files – frequency of copies | Usually daily (perhaps augmented with a snapshot regime on the file server/NAS) |

**Archive data set**

| Aspect | Recommendation |
| --- | --- |
| Method | Archive and replace with a tiny stub file (or completely remove from primary storage – less common) |
| Frequency of copies | Monthly, or perhaps fortnightly |
| Archive candidature | Typically older than 24 months, but can be lower depending on data access patterns |
| Where to keep the backup and archive copies | Commonly nearline disk, as this allows immediate and easy recovery at low cost; tape is also highly economical, especially for vast data volumes. Cloud is an option too, and many modern data management platforms have strong cloud interfaces |
| Archive protection | Archived files should be retained in at least 2 places: disk and tape, or 2 disk copies in different sites, to cater for a disaster event |
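The outline above could also be captured as a simple policy structure. This is purely illustrative: the field names do not correspond to any particular product's configuration schema.

```python
# Illustrative retention policy structure only; names and values mirror the
# guidance in this article, not any vendor's actual settings.
retention_policy = {
    "current_data_set": {
        "keep": "1 month",            # benchmark if the restore profile is unknown
        "database_copies": "hourly, one kept as the daily representative",
        "file_copies": "daily, plus NAS snapshots where available",
    },
    "archive": {
        "method": "stub-and-archive",  # tiny stub file left on primary storage
        "copy_frequency": "monthly",   # or fortnightly
        "candidates_older_than_months": 24,
        "minimum_copies": 2,           # e.g. disk + tape, or disk at two sites
    },
}

print(retention_policy["archive"]["minimum_copies"])  # 2
```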
When aged data grows to an unmanageable scale, how should this be managed?
Some data – especially voluminous data sets such as geo-seismic data, clinical records such as pathology tissue scans or x-rays, or files from a complex engineering project – may need to be retained for many years. Clinical records often need to be retained for the life of the patient. Engineering drawings can provide great insight into a building for refurbishment or later demolition, and the life of a building or a bridge could be in excess of 100 years. How do we retain all of this data, and where is it best kept, if it grows to 50 or 100TB (or more) over the life of an organisation? Keeping such aged data on the primary storage disk array is uneconomical, especially on new and expensive all-flash arrays, and ever more so when the access rate is low. Relative likelihood of access:
- WORM (write-once, read many)
- Version control
- A minimum of 2 copies of every file to cater for corruption
- Self-healing from corruption
- Geographically dispersed copies and replication, to cater for DR
- Therefore its data doesn’t need backing up – because of the features above
Are you one of those super-paranoid organisations who have decided to keep all backups forever?
Then you are certainly not alone. The vendors will love you; however, you must consider how you will handle technology obsolescence. While the shelf life of an LTO tape cartridge is 30 years if stored under ideal conditions, the chances are that in 15 years you won't actually have the hardware or software technology necessary to recover the data.
- Focus on data retention, not backup retention
- Understand your restore profile and work out how many levels of recoverability are needed – it will be less than you think
- Investigate whether your applications actually retain most historical data online (likely!)
- Aggressively reduce the number of backups you retain, in line with the points above
Hitachi Data Systems has announced the winners of its Australia and New Zealand Partner Awards, where Perfekt was awarded Platinum partnership status and National Partner of the Year.
The awards were presented during HDS' partner summit held on Hamilton Island on 22-24 November 2016.
Many years ago, before most backup products had backup-to-disk options, vendors launched disk-based appliances, known as Virtual Tape Libraries, as a backup target. These enhanced the backup and restore experience over tape. Over time these appliances were enhanced to provide in-box deduplication, compensating for the lack of such features in some backup solutions.
Time has of course moved on, and products such as Commvault have had backup-to-disk as an inherent part of their architecture since inception, with deduplication supported for many years now.
So the question becomes: What is the best way to perform deduplication? Within backup software, or with an appliance?
Appliances seem easy: you don't need to consider deduplication within the backup software, you just buy a box with 3x the capacity of the data being backed up, and you might get a month or so of backups on disk. The backup software sees it as a generic disk target.
However, it really isn’t that simple, and there are a number of factors often overlooked when contemplating this approach. This blog article is designed to flesh these out for you so you can make a considered decision. These are broken down into a few distinct categories:
A dedupe appliance is a self-contained unit and generally can’t be messed with by outside factors. For some users this “black box” approach is simple, but has a number of notable downsides.
Appliances perform the deduplication task far too late in the data path to be useful: deduplication is done at the very end of the data movement process. Software dedupe (eg by Commvault) happens at the start of the data path – nothing leaves the client computer unless the Media Agent confirms that it is needed. With a dedupe appliance, all of the data has to be sent by the client across the network and be processed by the Media Agent, then sent from the Media Agent to the appliance, at which point it is finally deduped. There is no aid to backup performance when you still need to move all of the data.
When one compares a typical amount of data that is sent out of the client using Commvault dedupe – somewhere around 2% to 5% daily is quite normal, against what gets sent across the network when using a hardware appliance – 100% – there is really no contest. Commvault dedupe saves significant amounts of two things: time and space. Time is saved because very little is sent across the network from client to Media Agent, and space is saved because only a single copy of everything is stored. A dedupe appliance saves space but no time at all because of the 100% data transmission.
Dedupe performed at the client (as Commvault does) is "content aware": for every item it backs up, it restarts the block alignment.
For example, if a system has file 1, file 2 and file 3, and a user edits file 1 and makes it bigger, this won't disrupt the deduplication for file 2 and file 3: when the agent has finished with file 1, it opens file 2 and starts again at the beginning of the file, so everything lines up again. A dedupe appliance has no idea what is going on in the client and is not content aware. Commvault (and other backup products) will write big chunk files to the appliance, and these don't dedupe very well once some content has been edited as described. Today's chunk files don't look much like yesterday's, so an appliance simply cannot get the same sort of space savings as deduplicating at the source the way Commvault does. No dedupe appliance can achieve dedupe savings of 98% or 99%, yet this is often achievable with Commvault dedupe.
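The alignment effect can be seen with a toy model: hash fixed-size blocks of one concatenated stream (roughly how an appliance sees chunk files) versus restarting the blocks at each file boundary (content-aware). The tiny 8-byte block size and sample files are purely illustrative; real systems use much larger blocks.

```python
import hashlib

BLOCK = 8  # tiny block size for illustration only

def blocks(data: bytes):
    """Split data into fixed-size blocks and hash each one."""
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def dedupe_ratio(old_hashes, new_hashes):
    """Fraction of new blocks already present in the old block store."""
    store = set(old_hashes)
    return sum(1 for h in new_hashes if h in store) / len(new_hashes)

file1 = bytes(range(24))
file2 = bytes(range(100, 124))
file3 = bytes(range(200, 224))

# Day 1 backups.
# Appliance-style view: blocks aligned to one concatenated stream.
day1_stream = blocks(file1 + file2 + file3)
# Content-aware view: block alignment restarts at the top of every file.
day1_files = blocks(file1) + blocks(file2) + blocks(file3)

# Day 2: a user prepends 3 bytes to file1, shifting everything after it.
file1b = b"xyz" + file1
day2_stream = blocks(file1b + file2 + file3)
day2_files = blocks(file1b) + blocks(file2) + blocks(file3)

print(dedupe_ratio(day1_stream, day2_stream))  # 0.0 – every block shifted
print(dedupe_ratio(day1_files, day2_files))    # 0.6 – file2/file3 still dedupe
```

With per-file alignment, the unchanged files still deduplicate perfectly; in the concatenated stream, the 3-byte edit shifts every subsequent block and nothing matches.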
Note the Size of Application (original data size, at source), the Data Written (data sent across the network to the Media Agent), and the Savings Percentage – a whopping 98%! Only 2% of the data was sent across the network and only 2% was stored in the disk library. No dedupe appliance can achieve ratios like that.
The picture gets murkier when considering multiple sites and DR protection of backup copies. Such site-to-site copying can only be performed efficiently when you have purchased two such dedupe appliances. Commvault cannot assist in this process, as it is handled in the back end by the appliance, and Commvault is (mostly) unaware that the copy has been made. Now you have hardware vendor lock-in across two sites that must both be upgraded together.
Often, in such circumstances you have limited control over the mirror site policies. Commvault solves all of this with DASH copy, where the hardware can be dissimilar and your retention policies can be as varied as you like: keep some jobs for 3 months at Prod and 1 month at DR, keep other jobs for 3 months on each site, etc. As granular as you might need.
Since dedupe appliances only see what data is sent to them, large and complex sites with a mix of backup and archive data are not able to take advantage of global site dedupe. Every site, no matter how big or small, would need the same vendor's dedupe appliance, and this gets inordinately complex (not to mention expensive) when trying to coordinate large-scale fan-in of remote-site content. It is simply not feasible. Contrast that with Commvault software-based dedupe: you can protect a single desktop right through to a massive NAS device of >1PB, and dedupe will work for all enterprise data, backup and archive, with cross-site data management handled efficiently and seamlessly.
Even after purchasing the dedupe appliance, you still need to consider the licensing obligations of the backup software itself. With Commvault, depending on your licensing model, in the early days you needed to license the total addressable (effective) capacity of the dedupe appliance. So if it had 40TB of usable disk and, with deduplication, offered 150TB of effective space, you would need to purchase a Standard Disk Option license for 150TB.
With the newer Commvault licensing schemes such as a Capacity License Agreement and the VM protection Solution Bundles, your license already includes Commvault deduplication capability! Why would you want to go out and buy another/different solution when you have paid for it? 95% of backup systems are configured with deduplication as it is the market expectation. It makes no sense to bypass that and buy an appliance to do the same thing.
Once you have Commvault deduplication licensing, there is no more to pay for back-end dedupe capacity expansion, just the cost of the disk. Dedupe appliances will cost more than generic JBOD disk, so you are paying more than you should.
Related to this, a dedupe appliance locks you in to a specific hardware vendor. Your upgrades need to come from them, and your second-site appliance must be the same, so that must also come from them.
Further, capacity upgrades to the dedupe appliance are limited to what the vendor offers, and some boxes are restrictive in capacity options. With Commvault software dedupe you can add JBOD after JBOD separately (even from different vendors) and not be concerned with the library device itself. You are free to choose any vendor's Media Agent server, so long as the specification meets the needs of deduplication according to Commvault guidelines. If your shop has a preference for HP, Dell, Cisco or IBM/Lenovo, you can stay with that choice.
All of the points above add up to a total cost of ownership for dedupe appliances which can only be more expensive than using the Commvault deduplication you probably have already in your environment.
- Earlier in the data path = faster backups, less user impact, less network traffic (thinner pipes)
- Software implementation = regular JBOD, no vendor lock-in with proprietary algorithms and hardware
- Client aware = more efficient deduplication
- Multiple site protection = easy implementation, allow for cost-effective tapeless DR
- Remote copy policy flexibility = limit disk capacity to rules that follow a business retention process, not one mandated by vendor design
- Global dedupe = corporate benefits
- And don’t forget that dedupe appliances may also have a Commvault license obligation
October, 2015 – HDS has awarded Perfekt their Regional Partner of the Year award at the HDS Partner Summit in Noosa, QLD. This award is on the back of Perfekt winning HDS’ Solutions Partner of the Year in the previous year.
In his presentation, Phil Teague, HDS Industry & Alliances Solutions Director, congratulated Perfekt on their massive growth over the last 12 months, especially given that their operations are only in Melbourne and Perth. The award was presented to the Perfekt management team by HDS Vice President and General Manager for ANZ, Nathan McGregor.
Let's face it, this topic has been in the back of everyone's thinking for quite some time, yet few organisations of scale can achieve it. Tape has been around since the 1950s, when it was pioneered by IBM as a low-cost, offline and portable storage medium. In the last 65 years it has seen significant transformation, with the market now centred fairly singularly on the LTO Ultrium cartridge format.
LTO-6 is the current generation offering roughly 5TB of compressed data per cartridge, with a roadmap that extends to LTO-7 in October 2015, and LTO-8 which will see this increase even further over coming years.
The reality is that, since I worked at Quantum between 2000 and 2007, there has been a dramatic change in the paradigm for tape usage. Because of its portability and sequential nature, tape often became the reason people disliked backup. Yet backup need not be so dull!
These days, backups are staged to disk first before being copied to tape. Smart backup solutions are able to electronically copy backup content from one second-tier disk system to another, usually in an alternate site, so that the need for making regular tape copies is significantly diminished.
In CommVault's terminology this is called a DASH copy. DASH is a horrible acronym for Dedupe-Accelerated Streaming Hash, which is about as bad as all of those terrible acronyms IBM made up in the 1980s for their products. Forget the acronym; DASH just means FAST, and that's what it delivers, by transferring only new and unique sub-blocks of data between the primary and secondary copies of backup content.
This technology means that you can copy backup (or archive) data in any of these scenarios:
- From Production to DR
- From one or more remote sites to head office/data centre
- From any site to a cloud data centre
- Or all of the above together in any combination
The upshot of this is if you are copying data between disk arrays at your sites then your reliance on tape is significantly diminished.
When DASH copy is implemented, Perfekt finds that clients today often purchase a 1-, 2- or 4-drive tape library or autoloader and make just weekly, fortnightly or monthly tape copies, which serve archival purposes more than traditional restore.
Because of the licensing schemes available with the CommVault Capacity License Agreement and the new Solution Bundles, clients are no longer metered on the back-end capacity of backup data stored. You can retain a day, a month, a year or a decade on disk for no additional license charge. You just need:
- The disk space to retain it
- A sufficiently large dedupe database on your media agent server
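As a rough guide to the second point, using the hash sizing mentioned later in this article (each 128K block produces a 4K hash entry), the dedupe database can be estimated like this. Treat it as a sketch: real sizing depends on block size settings and Commvault's own guidelines.

```python
# Rough dedupe database (DDB) sizing sketch: one 4KB hash entry is kept
# for each 128KB block of unique data, i.e. about 3% overhead.
def ddb_size_gb(unique_data_tb: float) -> float:
    """Approximate DDB size in GB for a given amount of unique data."""
    return unique_data_tb * 1024 * (4 / 128)  # 4KB of hash per 128KB block

print(ddb_size_gb(10))  # 10TB of unique data -> 320.0 GB of dedupe database
```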
What do you need to get DASH Copy Working?
There are a couple of “considerations”. A consideration is a problem if you don’t think it through. If you plan ahead, then you will not run into issues.
The first is how to make the initial copy of data. DASH copy is incredibly efficient at moving backup content between sites. However, there is no special magic. That first copy will take some time to move. How long depends on:
- The data volume
- The network link (and how much of it you can use for this)
- A whole bunch of other “overheads”.
The devil is in the detail, so at Perfekt we have devised a simple formula to help you work this out which provides an approximation of the duration, in days, for the initial copy:
Duration (days) = Data Volume (GB) ÷ Available Link Speed (Mb/sec) × Constant
The constant factors in compression, TCP overhead, as well the CV Index and dedupe hash size. The following is a summary of the estimated numbers used for these factors:
- An estimated -15% allowance for the benefits of compression is given
- A +30% overhead for TCP/IP on the link speed
- +5% for the CommVault Index of the Data
- The Dedupe database creates a hash of each 128K block, which is 4K in size (+3%)
- Finally a unit conversion is made to account for data in GB and link speed in Mbps to output a duration in days
As an example, a site with 500GB of data on a link with 10Mbps available would take at least 5.7 days to complete the initial copy process.
As an alternative to transferring the initial backups over the WAN, it is possible to seed the data using a portable USB-attached hard drive. In this approach, this hard drive transports the initial data set manually before establishing the regular (eg daily) DASH copy process.
Such a process, however, involves considerable time and effort in handling and shipping drives, so Perfekt suggests considering USB seeding only if the WAN transfer time would exceed 14 days.
Of course, once the seeding is complete, the ongoing load is small, since users do not rewrite entire reports, databases, presentations or spreadsheets every day. What is captured is just the sub-block changes, and these are efficiently replicated to the alternate site after each backup.
You can use the same formula as above, but take the daily sub-block change rate of between 2% and 5% of the data volume to determine the nightly DASH copy duration.
Taking our example of 500GB of data at a site with a 10Mbps link, assume 2% or 5% daily change. Pop that into the formula and the DASH copy duration on the same 10Mbps link is:
- 2%: 2hrs and 45 mins
- 5%: 6 hrs and 51 mins
These are certainly achievable in an overnight window.
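The factor list above can be turned into a rough calculator. The exact formula isn't reproduced here, so the constants below are an approximation assembled from the stated allowances; small differences from the quoted durations come from rounding.

```python
# Approximate reconstruction of the DASH copy duration estimate, built from
# the factors listed above (-15% compression, +5% index, +3% dedupe hashes,
# +30% TCP/IP overhead). A sketch, not Perfekt's actual formula.

def dash_copy_days(data_gb: float, link_mbps: float,
                   change_rate: float = 1.0) -> float:
    """Estimate DASH copy duration in days.

    change_rate=1.0 models the initial full copy; use 0.02-0.05 for the
    nightly sub-block changes.
    """
    payload_gb = data_gb * change_rate * 0.85   # -15% for compression
    payload_gb *= 1.05 * 1.03                   # +5% CV index, +3% dedupe hashes
    megabits = payload_gb * 8 * 1024            # GB -> megabits
    effective_mbps = link_mbps / 1.30           # +30% TCP/IP overhead on the link
    return megabits / effective_mbps / 86400    # seconds -> days

print(round(dash_copy_days(500, 10), 1))             # initial copy: 5.7 days
print(round(dash_copy_days(500, 10, 0.02) * 24, 1))  # nightly at 2% change: ~2.7 hours
```

The same function covers both the initial seed and the nightly incremental copy; only the change rate differs.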
We recommend that a minimum link speed of 10Mbps is used to support DASH copy. This ensures that it can make that first copy in sufficient time, but is also fast enough to handle the nightly copy should there be a rare occasion where something dramatic causes the change rate to be 10 or 15%. It may take a day or two to catch up. If the link was too slow, it may fall behind for so long that there is an exposure in getting the data off site.
With ongoing data growth and general system changes it is important to monitor transfer times of the DASH copies to ensure that they are completing in a reasonable time period and not lagging behind. Perfekt suggests that this is done with Aux Copy Fall Behind Alerts in console progress reporting.
Also the DASH copy summary report should be reviewed each month to monitor the overall health of the copies. This will help identify sites where greater link speeds may be required in the near future.
What if you don’t have a second site? Look up in the sky!
Not a problem. There are oodles (the technical term meaning more than you could imagine) of cloud providers wanting you to store your backup data with them. There are two ways of storing CommVault backup data in cloud storage. (I hate the term "the cloud", as if there is only one; in reality there are many offerings, they are all different, their costs are not the same, and a good number will be out of business in less than 5 years.)
The first way is to DASH copy to a cloud provider, and this is the preferred approach. You would stand up a virtual CommVault media agent server in the cloud and purchase some cloud storage. The media agent does some hefty work, so the only gotcha here is the compute cost of the virtual server if your chosen cloud provider charges this way. It is best not to use this model for backup unless you pilot the process, measure the IOPS and extrapolate them within the costing model of your cloud provider.
The second way is to move data directly to some type of cloud storage without DASH copy. The issue here is that cloud providers usually charge per GB per month, and pushing large data volumes to a cloud service without the benefit of dedupe becomes unaffordable after a few years of a lengthy backup retention strategy. (It is affordable if you only want 1-6 months of content, but that is not the normal business data retention cycle for most organisations, especially if you are looking to remove tape altogether. Any longer than a few years and you will quickly work out that you can buy a small tape library with LTO-6 drives and have plenty of change left over compared to the cloud costings.)
Removing Tape – What Disk is Needed?
In a traditional backup topology, tape provides two key functions:
- A point in time complete “archive” copy beyond the longest disk-based retention period
- A copy of data as a last chance of recovery if all else fails
Because deduplication allows you to retain many years of data copies quite effectively on disk, the need for point 1 is negated. Addressing point 2 is a business decision, and many sites do not have such a last-resort copy today.
Returning to point 1, a few basic factors need to be determined in order to estimate the size of disk array required to retain your online backup content:
- How large is the first full copy of data: typically we see about 20% reduction due to compression and some deduplication
- Retention: for how many years you will retain backup copies
- Number of backups: e.g. 5 days per week or 7 days per week, 52 weeks per year
- Daily rate of change: typically between 2% and 5%, depending on the workload
The disk space required can then be approximated using this formula:
Disk Space Required (TB) = (Protected Data Volume (TB) × Allowance for compression & some dedupe) + (Protected Data Volume (TB) × Rate of daily change, 2-5% × Number of backup days retained)
So in a site with 10TB of data with: normal 20% savings on the first backup, backups occurring 5 days per week, 52 weeks per year, online retention of 10 years, 2% of daily change; the usable disk volume required is then 528TB. Utilising 4TB nearline SAS drives, this could be accomplished in a storage array with dense enclosures in a tidy 9 rack units of footprint!
Of course this is simplified: volumes will start out smaller and grow with increasing retention, and understandably there will be primary data growth and fluctuations in usage patterns over the retention period. Still, it provides a good indication of the likely data capacity required.
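The worked example above can be reproduced with a short Python sketch: one compressed full copy plus one changed-data increment per retained backup day. The function name and defaults are ours, chosen to mirror the figures in the text:

```python
def disk_space_tb(data_tb, compress=0.8, change_rate=0.02,
                  backups_per_week=5, years=10):
    """Approximate usable disk needed for long dedupe-based retention:
    one reduced full copy plus one changed-data increment per backup."""
    backup_days = backups_per_week * 52 * years      # 2600 in the example
    return data_tb * compress + data_tb * change_rate * backup_days

# 10TB protected, 20% first-copy savings, 5 backups/week for 10 years, 2% change
print(round(disk_space_tb(10)))  # -> 528 TB usable, as in the text
```

Plugging in your own change rate and retention period quickly shows how sensitive the capacity requirement is to those two inputs.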
Aren’t Spreadsheets Wonderful!
To extrapolate running costs of the required backup storage, here is a quick comparison of the disk array outlined above for the second (remote) site copy of the data, retained for 10 years:
On-premise/co-lo high density storage array:
- 528TB usable, purchased up front
- 10 year vendor support, inclusive of running costs
- Total: $334K ex GST

Cloud storage:
- Ingest tier: $0.0259 per GB per month
- Storage tier: $0.012 per GB per month
- Incrementally growing over 10 years
- Compute (to run the media agent): $1.169 per hour
- Total: $393K ex GST
Neither figure includes the cost of retrievals; retrievals from cloud storage will be “problematic” at best, and should only be required if all else has failed.
So, there is not a great deal in it when you factor this over a 10 year period, but it is useful for benchmarking the differences between the available options. Of course, this is simply to protect 10TB of data, without taking into account its own growth due to new workloads etc. The operational note on retrieving data is important: restoring from the on-premise storage will be very simple, whereas the cloud-based storage will be very slow (“tape-like”) and should only be used in emergencies.
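For your own comparison, a crude cloud cost model is easy to build. The sketch below assumes the stored volume grows linearly from zero to its final size over the period (our simplification of "incrementally growing"), and covers the storage tier only; ingest, compute and retrieval charges are extra:

```python
def cloud_storage_cost(final_tb, storage_per_gb_month, months=120):
    """Rough cumulative storage bill, assuming the stored volume grows
    linearly from zero to final_tb over the period. Illustrative model
    only: ingest, compute and retrieval charges are not included."""
    avg_gb = final_tb * 1000 / 2   # average volume under linear growth
    return avg_gb * storage_per_gb_month * months

# 528TB retained copy at the $0.012/GB/month storage tier over 10 years
print(round(cloud_storage_cost(528, 0.012)))  # -> 380160
```

Under this model the storage tier alone comes to roughly $380K over 10 years, which is broadly consistent with the $393K total quoted above once compute and ingest are added.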
And if the numbers just don’t work, there is still tape
Full-scale recoveries are rare, and most restore jobs are for small data sets. Depending on data volumes, retention requirements and other business factors, we find today that tape is still a very low-cost way of creating archival data copies. Made once per month, for example, a single- or dual-drive LTO-6 autoloader is all that is needed to push a retention copy to tape that will probably never be needed, but gives surety and another process to demonstrate strong data governance.
Should any of this be within your thinking, give the experts at Perfekt a call. We would love to help with your backup strategies.