Best way to quickly access about 300GB of data? Mount and drag/drop too slow

I've set up cross-country replication (from a West coast Core to an East coast Core) of 4 machines (physical machines, NOT virtual). I'm testing Virtual Standby as a method of booting up one of the machines so I can get to the Exchange EDBs on cutover night. There is about 300GB between 2 EDBs. I'll need "quick" access to that data so I can import the mailboxes into my East environment. I can't wait 5-6 hours to mount the drive and copy the data out.

The theory is, I boot up the Virtual Standby and can get to the data much more quickly. BUT then I had a new theory: why can't I mount the VMDK as a drive on another VM? Done. I tried that, and Windows didn't like the drive: "Invalid" or something like that.

So, my question: what are the methods to quickly get to backed-up data (when it's 300+GB)? Simple file restores are quick; 300GB is NOT quick using a traditional mount and drag/drop.

 

thanks!

  • Rapid Recovery performance depends first and foremost on the IOPS available on the repository storage system. That said, the fastest method to get data out of a repository is to do a restore to another protected machine. Specifically, I would use a machine with good-IOPS storage and enough memory and CPU power, and protect it in Rapid Recovery. No backups are needed. I would add a disk (or a volume of over 300GB). On the Core, I would navigate to the recovery point that I need to restore and run the restore wizard.

    Hope that this helps. Please let us know how it goes.

  • OK, cool. I didn't know you could restore a disk to a different protected machine. I may restore to my Win10 workstation as it has a fast SSD and plenty of horsepower; first I'll add it to Rapid Recovery. Thanks, I'll let you know.
  • Please note that you restore a volume, and as such you need a spare disk or a partition of the same size or larger than the original volume. Thanks for keeping us posted. :)
  • One of the disks is Dynamic and will be converted to Simple. I guess that adds some time? Yes, I just figured out the destination storage requirement!
    2nd question: what exactly does the restore do? Does it create a folder on the destination drive called something like "e:\restored_data", or does it overwrite the destination drive?

  • It recreates the destination drive, block by block and sector by sector. Among other advantages, this means that the permissions are preserved.
  • Sorry to be a pain, but it sounds like you are saying it DOES wipe the destination. So my best bet is to create a D:\ partition and restore to that. I was about to restore to my C:\ drive. What would have happened?!
  • It would not allow you to restore to the System volume. However, it is not uncommon to have valuable data spread over multiple disks :) . Creating a partition of 300GB or larger and using it for the restore would do, though. Obviously, everything on that partition will be wiped out.

  • Tudor.Popescu is extremely inaccurate, and the Rapid Recovery Development Team will back me here... While IOPS are a factor, it's an EXTREMELY small factor due to how RR's repositories handle data. My company specializes in providing customers the best possible RPO/RTO available for their buck, down to virtually seconds for some customers, and I can tell you AppAssure isn't capable of this by a long shot.

    Rapid Recovery has an approximate max throughput of around 125mbps on a repository, and mind you this is on a system that is capable of writing at multiple GBps to long-term-archive SSDs. You can throw the fastest disks in the world at Rapid Recovery and it will not be able to use them. The average 12-disk SATA system or the DL4000 appliance will recover at approximately 30MBps and is pretty much maxed out at that. While some may be okay with this, others may have problems with this speed.
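    For context, here's a quick back-of-envelope in Python of what those rates would mean for the ~300GB from the original question, reading both figures as MB/s:

    ```python
    # Hours needed to move 300GB at the recovery rates quoted above.
    SIZE_GB = 300

    def hours_at(rate_mb_s):
        return SIZE_GB * 1024 / rate_mb_s / 3600

    print(f"~{hours_at(30):.1f} h at 30 MB/s")    # ~2.8 h
    print(f"~{hours_at(125):.1f} h at 125 MB/s")  # ~0.7 h
    ```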

    It's highly suggested that Virtual Standby be used, or the proper recovery process of BMR/exporting the operating system disks and then live-recovering the remaining disks, or that you look for a backup product without Rapid Recovery's limitations.

    As for your virtual standby solution, Jordanl: you can technically do what you're trying to do, just keep in mind that it will likely screw up your Rapid Recovery virtual standby instance.

  • Hi Wayne:
    Your post shows some confusion and may mislead other people, as we are both talking about things located at different ends of a large spectrum. As such, I am in the position to explain some issues that may be of interest to our mid-market customers.

    Let's start with the basics. IOPS means IO operations per second: the number of read or write operations that can be performed in one second. Rapid Recovery works with data in blocks of a fixed size of 8KB. (Please note that backup data is deduplicated BEFORE being committed to the repository.)
    As such, the amount of deduplicated data reaching the repository is limited by the repository's capacity to absorb it.
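    To make the relationship concrete, here is a rough back-of-envelope in Python, assuming every IO moves exactly one 8KB block; the IOPS figures are hypothetical examples, not measured Rapid Recovery numbers:

    ```python
    # Throughput implied by an IOPS budget when every IO is one 8KB block.
    BLOCK_KB = 8

    def throughput_mb_s(iops):
        return iops * BLOCK_KB / 1024  # MB/s

    for iops in (500, 2000, 10000):
        print(f"{iops:>6} IOPS -> ~{throughput_mb_s(iops):.1f} MB/s")
    # 500 -> ~3.9 MB/s, 2000 -> ~15.6 MB/s, 10000 -> ~78.1 MB/s
    ```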

    At the same time, there are various operations executed at the repository level, such as Rollups, Deferred Deletes, Mountability checks, Attachability checks, and Recovery Point checks, which read, consolidate, and delete unnecessary data. These operations consume storage resources, which are measured in IOPS as well.

    In a normal usage situation you have data coming in (which takes IOPS dedicated to writes) and data being processed on the repository at the same time (which requires other IOPS dedicated to reads). Since the number of IOPS a storage system is able to deliver is fixed, there is competition between the various jobs running at the same time. Moreover, both read and write operations are performed in a random manner, which, as everybody knows, diminishes overall storage system performance.

    As such, in most cases, the storage system is the main bottleneck in Rapid Recovery performance. The "Highest Active Time" counter in Windows Resource Monitor condenses the whole story into a number.

    To complete the picture, here are two more points to make.

    First is about ingesting data: basically, how quickly data backed up from the protected machine can reach the repository. There are three elements here:
    1. The load on the machine to be backed up -- this pertains to available IOPS on the volumes to be backed up, CPU and memory load.
    2. The health/speed of the network connection to the core (by default the transfer is made over 8 streams; obviously jitter and dropped packets do not help).
    3. The in-line deduplication process. There is a balance between the deduplication speed, the size of the dedupe cache, and the data that is actually committed to the repository. To explain: some of the incoming blocks are identical to blocks already present in the repository and are dropped before hitting the storage (see the sketch below).
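    A minimal Python sketch of that idea, with a plain in-memory hash set standing in for the dedupe cache (the real cache and repository format are more sophisticated):

    ```python
    # Drop incoming blocks whose content has already been committed.
    import hashlib

    BLOCK_SIZE = 8 * 1024  # 8KB blocks, as described above

    def dedupe(blocks, cache):
        """Yield only new blocks; duplicates never hit the storage."""
        for block in blocks:
            digest = hashlib.sha256(block).digest()
            if digest not in cache:
                cache.add(digest)
                yield block

    # Three incoming blocks, two identical -> only two are committed.
    blocks = [b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE, b"A" * BLOCK_SIZE]
    print(sum(1 for _ in dedupe(blocks, set())))  # 2
    ```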

    The most common bottleneck in this process is the load on the system to be backed up (including available IOPS). Everything else is normally much faster.

    Second, there is the replication process. I won't go into detail, as some of my previous explanations apply here too. Suffice it to say, this is the only typical case where the bottleneck is related to the connection speed.

    Another point I have not made yet concerns AV protection. As everyone knows, all data traveling to and from the storage system is intercepted by the antivirus filter drivers (even if the AV software is disabled). Most AV filter drivers are not designed to deal with large amounts of data and may create problems and terrible bottlenecks. Applying some exclusions may help.

    That being said, let's look at numbers. Many times customers see replication speeds that are higher than the theoretical network speed. For the record, network speed is measured in Mb/s (megabits/s), while Transfer/Replication speed is measured in MB/s (megabytes/s). As such, you need to divide the network speed by 8 to express it in MB/s.
    To go back: many customers with slow WAN connections -- for instance 8Mb/s (1MB/s) -- see replication transfers of 1.5-2 MB/s. This is due to deduplicated & compressed data being transferred through the wire.
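    The arithmetic, for the record:

    ```python
    # Links are rated in megabits/s; transfers are shown in megabytes/s.
    def to_mb_s(mbit_s):
        return mbit_s / 8  # 8 bits per byte

    print(to_mb_s(8))  # the 8Mb/s WAN link above = 1.0 MB/s raw
    # Seeing 1.5-2 MB/s over that link is possible because the data on
    # the wire is deduplicated and compressed before transfer.
    ```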

    In your post, Wayne, you say that Rapid Recovery has a max transfer rate of 125mbs. If you mean 125Mb/s, that would be ~13MB/s, which is obviously not the case. If it is 125MB/s, which I believe it is, that is the theoretical speed of a 1Gb/s network connection. Taking into account that, depending on the network specs, TCP/IP needs 5-10% overhead, it does not look that bad.
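    The same arithmetic for a 1Gb/s link with 5-10% protocol overhead:

    ```python
    # Usable payload of a 1Gb/s connection after TCP/IP overhead.
    raw_mb_s = 1000 / 8  # 1Gb/s = 125 MB/s on the wire
    for overhead in (0.05, 0.10):
        print(f"{overhead:.0%} -> ~{raw_mb_s * (1 - overhead):.0f} MB/s")
    # 5% -> ~119 MB/s, 10% -> ~112 MB/s: right around the 125MB/s figure
    ```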

    In this case, since the bottleneck has shifted to the connection side, increasing the available IOPS that the storage system can deliver will not improve performance.

    Now, regarding the "system that is capable of writing at multiple GBps to Long Term Archive SSD's": there are many great storage systems available on the market. Some of them -- the most performant ones -- are specialized for specific tasks; others are designed for general usage but at a lower performance level.

    In your case, based on the description you provided, it looks like a specialized system designed for archiving. This is very different from what a Rapid Recovery repository needs. For instance, if your system ingests data through flash serialization (as Nimble Storage does), it is most likely not optimized for random reads, so it won't really improve performance dramatically :). Optimized archives are supposed to be contiguous. Remember, Rapid Recovery works with 8KB blocks. Even if your system is optimized to work with large blocks, it won't help.

    The DL4000 was a splendid machine in its time, as it balanced the price/performance ratio well. If properly updated and upgraded, it still works great. Besides the local storage, depending on the license type, it is possible to add one or two MD1200 storage enclosures.

    I fully agree that replacing its original drives with 15K SAS drives would improve general performance -- which boils down to IOPS. Please note that the DL4000 and later appliances use RAID 6 -- which adds to the read/write penalty -- to avoid disk punctures.
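    To illustrate, a rough calculation; the 6-IO write penalty is the usual RAID 6 rule of thumb, and the per-disk IOPS figures are assumptions, not measurements:

    ```python
    # RAID 6 random write penalty: read data + both parities, then
    # write data + both parities = ~6 disk IOs per logical write.
    def raid6_write_iops(disks, iops_per_disk, penalty=6):
        return disks * iops_per_disk / penalty

    print(raid6_write_iops(12, 175))  # ~350 write IOPS, 12 x 15K SAS
    print(raid6_write_iops(12, 80))   # ~160 write IOPS, 12 x 7.2K SATA
    ```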

    As a recovery strategy, you are right: it makes sense to recover the system disk and then do a live recovery for the data volumes. Since the typical system disk size is around 300GB (and seldom is all of it used), and assuming that the drivers for a dissimilar-hardware BMR were prepared in time, it takes only minutes to recover the system drive; then, thanks to Live Recovery, users can work with their files and applications while the rest of the restore is still running.
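    As a rough illustration (the used size and restore rate below are assumptions for the sake of the example):

    ```python
    # Time to restore only the used portion of a system volume.
    def restore_minutes(used_gb, rate_mb_s):
        return used_gb * 1024 / rate_mb_s / 60

    print(f"~{restore_minutes(60, 100):.0f} min")   # 60GB used at 100 MB/s
    print(f"~{restore_minutes(300, 100):.0f} min")  # worst case, all 300GB
    ```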

    Hope that this helps.

  • thanks for the very detailed response, I really appreciate it.

    So, it turns out I need about 300GB from a 900GB drive, so the restore-the-whole-volume method takes about as long as just mounting and dragging out the data I need. None of this restore has to be live; all users will be offline.

    **I'm exploring mounting the Virtual Standby VMDK as an additional HD on another VM. Should that work? So far the disk shows up as Not Initialized.** The VS VM is from a bare metal system with ZERO drivers for VMware, so it won't boot. I'm sure that's another topic.

    My current repository storage subsystem is a 4-drive SATA (maybe SAS) 7200rpm RAID 6. So, pretty weak. I should increase rpm and disk count, but $$.
    My current drag-and-drop rate from a mount is from 5MB/s to 15MB/s. Does that sound about right? If I delete all recovery points, do you think this might speed up? Is there a way to "optimize" the repository?
    Core stats: 12GB of RAM allocated to the Rapid Recovery Core, with about 3GB being consumed by the Core processes... low CPU load during drag and drop. I feel like I'm maxing out MB/s given my low disk/rpm count. Correct?
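    For what it's worth, a back-of-envelope check of that hunch in Python; the ~80 random IOPS per 7.2K disk is a rule-of-thumb assumption, not a measurement:

    ```python
    # Rough random-read floor for a 4-spindle 7.2K RAID 6 repository.
    disks, iops_per_disk, block_kb = 4, 80, 8  # assumed per-disk IOPS
    floor_mb_s = disks * iops_per_disk * block_kb / 1024
    print(f"~{floor_mb_s:.1f} MB/s purely random")  # ~2.5 MB/s
    # Read-ahead and sequential runs lift real mounts above this floor,
    # so an observed 5-15 MB/s is plausible at this spindle count.
    ```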

    Lots of questions, but good info for most users from this thread. Thanks again!