Management Granularity

Much of Apple’s marketing on Fusion Drive talks about moving data at the file and application level, but in reality data can be moved between the SSD and HDD portions in 128KB blocks.

Ars actually confirmed this a while ago, but I wanted to see for myself. Using fs_usage I got to see the inner workings of Apple's Fusion Drive. Data is moved between drives in 128KB blocks, likely determined by frequency of use of those blocks. Since client workloads tend to be fairly sequential (or pseudo-random at worst) in nature, it's a safe bet that if you're accessing a single LBA within a 128KB block that you're actually going to be accessing more LBAs in the same space. The migration process seems to happen mostly during idle periods, although I have seen some movement between drives during light IO.
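To make the granularity argument concrete, here is a minimal sketch of how per-block access counting could work. It is an illustration only, not Apple's algorithm: the 512-byte sector size, the function names, and the "promote the most frequently touched blocks" policy are all assumptions. Only the 128KB block size comes from the observations above.

```python
from collections import Counter

SECTOR_BYTES = 512                             # assumption: 512-byte logical sectors
BLOCK_BYTES = 128 * 1024                       # migration granularity observed above
LBAS_PER_BLOCK = BLOCK_BYTES // SECTOR_BYTES   # 256 LBAs share one 128KB block


def block_of(lba: int) -> int:
    """Return the index of the 128KB block that contains this LBA."""
    return lba // LBAS_PER_BLOCK


def hottest_blocks(accessed_lbas, top_n):
    """Rank blocks by how often any LBA inside them was accessed.

    Hypothetical policy: the most frequently touched blocks are the
    promotion candidates for the SSD.
    """
    counts = Counter(block_of(lba) for lba in accessed_lbas)
    return [block for block, _ in counts.most_common(top_n)]
```

Because neighbouring LBAs collapse into the same counter, one hot sector pulls the rest of its 128KB block along with it, which is the bet described above.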

What’s very interesting is just how quickly the migration is triggered after a transfer occurs. As soon as file copy/creation, application launch or other IO activity completes, there’s immediate back and forth between the SSD and HDD. As you fill up the Fusion Drive, the amount of data moved between the SSD and HDD shrinks considerably. Over time I suspect this is what should happen. Infrequently accessed data should settle on the hard drive and what really matters will stay on the SSD. Apple being less aggressive about evicting data from the SSD as the Fusion Drive fills up makes sense.

The migration process itself is pretty simple: data is marked for promotion or demotion, physically copied to the new storage device, and only then is the original removed. In the event of a power failure during migration there shouldn't be any data loss caused by the Fusion Drive; it looks like the source block is removed only after two copies of the 128KB block are in place. Apple told me as much last year, but it's good to see it for myself.
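The copy-then-remove ordering can be sketched as follows. This is a toy model under stated assumptions, not Core Storage's implementation: the dictionaries standing in for the two devices, the `location` map, the `migrate` function, and the `PowerFailure` exception are all hypothetical names introduced for illustration.

```python
class PowerFailure(Exception):
    """Hypothetical stand-in for an interruption in the middle of a migration."""


def migrate(block_id, src, dst, tiers, location, fail_after_copy=False):
    """Move one block between tiers using a copy-first ordering.

    tiers:    {"ssd": {block_id: bytes}, "hdd": {block_id: bytes}}
    location: {block_id: "ssd" or "hdd"}, the tier reads are served from
    """
    data = tiers[src][block_id]
    tiers[dst][block_id] = data        # 1. copy to the destination tier
    if fail_after_copy:
        # Interrupted here: the source copy and the mapping are untouched,
        # so the block is still readable from its original tier.
        raise PowerFailure(block_id)
    location[block_id] = dst           # 2. repoint the mapping
    del tiers[src][block_id]           # 3. only now remove the source copy
```

At every point in the sequence at least one complete copy is reachable through `location`, which is why an interruption in this model costs a stale duplicate rather than lost data.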

By moving data in 128KB blocks between the HDD and SSD, Apple enjoys the side benefit of partially defragmenting the SSD with all writes to it. Even though the Fusion Drive will prefer the SSD for all incoming writes (which can include smaller than 128KB, potentially random/pseudo-random writes), any migration from the HDD to the SSD happens as large block sequential writes, which will trigger a garbage collection/block recycling routine in cases of a heavily fragmented drive. Performance of the SSD can definitely degrade over time, but this helps keep it higher than it would otherwise be, given that the SSD is almost always running at full capacity and is the recipient of all sorts of unrelated writes. As I mentioned earlier, I would’ve preferred a controller with more consistent IO latency or for Apple to set aside even more of the PM830’s NAND as spare area. I suspect cost was the deciding factor in sticking with the standard amount of overprovisioning.

127 Comments

  • kamsar - Friday, January 18, 2013 - link

    Reliability-wise isn't Fusion Drive basically RAID 0? If it's doing block level migrations and one drive dies there's nothing left.

    Sure hope you've got time machine on... ;)
  • pgp - Friday, January 18, 2013 - link

    Yes, technically I think it's more similar to a JBOD configuration, but the reliability should be the same...
    IMHO Fusion Drive is good for noobs, but I'd rather choose which files should be stored in the flash drive and which ones in the mechanical drive, know about the free space in each disk, so I'd prefer a 128GB SSD and, separately, a 1TB hard disk to a 1.1TB Fusion Drive.
  • TrackSmart - Friday, January 18, 2013 - link

    I think drive configurations like this are really needed. Maybe not for you and I, but for 99.5% of people. Even people who aren't really "noobs".

    As an example, I purchased a 120GB SSD for a family member who is reasonably good with computers. It breathed new life into a 3 year old computer and was really noticed and appreciated. One year later, the whole thing was a disaster! There are documents, music, videos, etc all over the place. Usually 2 or 3 copies of the same files on both the SSD and the hard drive. Both nearly full. It took several hours to fix the mess.

    Bottom line: Most people can't, aren't willing, or aren't well-organized enough to keep files segregated between drives. Even people who you probably think would be able to handle it by virtue of being reasonably computer literate.
  • kmmatney - Friday, January 18, 2013 - link

    I agree. I'm currently all SSD in my work laptop, but going the manual hybrid route in my home computer. Although I'm pretty organized, it is a pain to move stuff around manually between the drives. For 2 of my kids' computers, I just went with Seagate Momentus XT drives, and they've been great. Not as good as SSD, but a fraction of the cost.
  • ToniCipriani - Friday, January 18, 2013 - link

    Multiple copies could be easily avoided, actually.

    On my RAID-0 SSD + 1TB hard drive configuration, I installed Windows 7 in a way that all the profile folders (Users and ProgramData) existed on the hard drive by default, and created NTFS junctions on the SSD to redirect any old software. I never even needed to open the C Drive anymore, and all files and desktop settings reside on the hard drive automatically.

    For older machines XP should support junctions as well.

    Now filling up the drive, that's a different story. And let me guess, the browser got filled up with toolbars too?
  • Zink - Friday, January 18, 2013 - link

    Fusion drive is even easier to use than that though and it speeds up all of your programs and files as well as it can with the SSD size given. With a setup like that there are always going to be things on the HDD that get used regularly and they will never see a boost from the SSD. There is the upside of better reliability but outside that boosting 120GB or 240GB of the most accessed files seems even better than permanent segregation.
  • dananski - Saturday, January 19, 2013 - link

    I have manual HD/SSD combinations in my desktop and laptop, have done the same for three PCs I've built for family and have similar setup for nearly every workstation at work. It seems that some users are naturally much better than others at handling their file storage, but I think it's invaluable for people to get good at organising the data systematically and consciously rather than to leave it up to an algorithm that might not have the same priorities.

    I don't like the sound of every file being written to my SSD then moved to the HDD - I'd get through write cycles for no good reason whenever copying a file to that hybrid drive, and if my HDD doesn't have redundancy I'd feel safer with my important docs on the SSD, even if they're not deemed worthy of the speed boost.
  • seapeople - Sunday, January 20, 2013 - link

    Couldn't you just put the "Pictures", "Music", and "Videos" libraries on the hard drive and keep the documents and everything else on the solid state drive? Seems to me like that would work for 99% of people and not require any user thought... So you have a video, you save it in the "Videos" location, etc, and these files would see very little difference being on the HDD vs SSD.
  • Wolfpup - Friday, January 18, 2013 - link

    Yeah, me too. Like my shows from my Tivo are obviously GIGAAAANTIC and don't NEED to have fast access to them. Ditto (even more so) for any music or iTunes stuff. It's not like it's THAT hard to figure out what to put where, but yeah, the average person unfortunately would probably be clueless about it.
  • KitsuneKnight - Friday, January 18, 2013 - link

    It's not so hard if you can get away with just moving your music/videos/images to the HDD. It's much harder if the data you work with is absolutely massive, though.

    One artist I know has a SSD as their boot drive, and 3 HDDs. The PSDs they work with are absolutely massive, and they produce a huge number of them. Working with them on the SSD is far better than the HDDs, but even loading them on that is a bit on the slow side (as opposed to the multi-minute loads from an HDD). But, they have far too many to fit on the SSD.

    So they have the PSDs spread over the 4 drives, filling most of them up, having to manually shuffle them around. Something like Fusion Drive would work far better, as it would be doing exactly what she's doing, just without the manual effort to constantly move old files off the SSD (resulting in multiple hierarchies). The older PSDs would be migrated to the HDDs automatically. And if she starts working on an old set again, they could be promoted back to the SSD... with no user effort.

    And isn't that the point of computers? To make us do less work? It seems a lot of people want to do the jobs of a computer for them. While I'd prefer Fusion Drive to let you pin/hint files to certain drives, I'd say in most cases a 2 drive set up doesn't actually provide real world benefit over an intelligent (which I'm assuming Fusion Drive is) tiering system, even for most power users.
