8000 Prevent data loss on Filer upgrade, deal with changes to Node layout · Issue #566 · filerjs/filer · GitHub


Closed
humphd opened this issue Nov 18, 2018 · 14 comments
@humphd
Contributor
humphd commented Nov 18, 2018

In c526445 and ee412d4 I made some adjustments to the layout of our filesystem Node. Specifically, I changed the meaning of mode from a node type to a POSIX-style mode number.

As we get ready to fix #555, I'm realizing that I don't want to break existing filesystems when people upgrade to this new code.

We need a way for a current Node layout to get upgraded to the new layout in place, without data loss.

Probably the easiest thing to do here is for me to switch type back to mode, and introduce a new property called posix_mode or something.

@modeswitch can you help me think this through before you land that? I can do the PR.

@humphd
Contributor Author
humphd commented Nov 18, 2018

Old Node layout:

function Node(options) {
  var now = Date.now();

  this.id = options.id;
  this.mode = options.mode || MODE_FILE;  // node type (file, directory, etc)
  this.size = options.size || 0; // size (bytes for files, entries for directories)
  this.atime = options.atime || now; // access time (will mirror ctime after creation)
  this.ctime = options.ctime || now; // creation/change time
  this.mtime = options.mtime || now; // modified time
  this.flags = options.flags || []; // file flags
  this.xattrs = options.xattrs || {}; // extended attributes
  this.nlinks = options.nlinks || 0; // links count
  this.version = options.version || 0; // node version
  this.blksize = undefined; // block size
  this.nblocks = 1; // blocks count
  this.data = options.data; // id for data object
}

Current Node layout:

function Node(options) {
  var now = Date.now();

  this.id = options.id;
  this.type = options.type || NODE_TYPE_FILE;  // node type (file, directory, etc)
  this.size = options.size || 0; // size (bytes for files, entries for directories)
  this.atime = options.atime || now; // access time (will mirror ctime after creation)
  this.ctime = options.ctime || now; // creation/change time
  this.mtime = options.mtime || now; // modified time
  this.flags = options.flags || []; // file flags
  this.xattrs = options.xattrs || {}; // extended attributes
  this.nlinks = options.nlinks || 0; // links count
  this.data = options.data; // id for data object
  this.version = options.version || 1;

  // permissions and flags
  this.mode = options.mode || (getMode(this.type));
  this.uid = options.uid || 0x0; // owner name
  this.gid = options.gid || 0x0; // group name
}
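An in-place upgrade between these two layouts could look something like the sketch below. This is an illustrative assumption, not Filer's actual migration code: the upgradeNode helper, the constants, and the getMode() mapping are all invented for the example.

```javascript
// Hypothetical sketch: upgrade an old-layout node to the current
// layout in place, without losing data. Names and constants here
// are assumptions for illustration only.
var NODE_TYPE_FILE = 'FILE';
var DEFAULT_FILE_MODE = 0x1A4; // 0644; an assumed default

function getMode(type) {
  // Assumed mapping from node type to a default POSIX mode.
  return DEFAULT_FILE_MODE;
}

function upgradeNode(node) {
  // The old layout stored the node *type* in `mode`; move it over.
  if (node.type === undefined) {
    node.type = node.mode || NODE_TYPE_FILE;
    node.mode = getMode(node.type);
  }
  // Fields introduced by the new layout get sensible defaults.
  if (node.uid === undefined) node.uid = 0;
  if (node.gid === undefined) node.gid = 0;
  node.version = node.version || 1;
  return node;
}
```

Because the upgrade only moves and adds fields, running it on an already-new node is a no-op, which is what makes lazy, per-node migration safe.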

@humphd
Contributor Author
humphd commented Nov 19, 2018

For reference, here's the stat(2) structure, and how it deals with backward compatibility:

struct stat {
    dev_t     st_dev;         /* ID of device containing file */
    ino_t     st_ino;         /* Inode number */
    mode_t    st_mode;        /* File type and mode */
    nlink_t   st_nlink;       /* Number of hard links */
    uid_t     st_uid;         /* User ID of owner */
    gid_t     st_gid;         /* Group ID of owner */
    dev_t     st_rdev;        /* Device ID (if special file) */
    off_t     st_size;        /* Total size, in bytes */
    blksize_t st_blksize;     /* Block size for filesystem I/O */
    blkcnt_t  st_blocks;      /* Number of 512B blocks allocated */

    /* Since Linux 2.6, the kernel supports nanosecond
       precision for the following timestamp fields.
       For the details before Linux 2.6, see NOTES. */

    struct timespec st_atim;  /* Time of last access */
    struct timespec st_mtim;  /* Time of last modification */
    struct timespec st_ctim;  /* Time of last status change */

#define st_atime st_atim.tv_sec      /* Backward compatibility */
#define st_mtime st_mtim.tv_sec
#define st_ctime st_ctim.tv_sec
};

humphd added a commit to humphd/filer that referenced this issue Nov 19, 2018
@humphd
Contributor Author
humphd commented Nov 19, 2018

And Node's Stats object layout:

Stats {
  dev: 2114,
  ino: 48064969,
  mode: 33188,
  nlink: 1,
  uid: 85,
  gid: 100,
  rdev: 0,
  size: 527,
  blksize: 4096,
  blocks: 8,
  atimeMs: 1318289051000.1,
  mtimeMs: 1318289051000.1,
  ctimeMs: 1318289051000.1,
  birthtimeMs: 1318289051000.1,
  atime: Mon, 10 Oct 2011 23:24:11 GMT,
  mtime: Mon, 10 Oct 2011 23:24:11 GMT,
  ctime: Mon, 10 Oct 2011 23:24:11 GMT,
  birthtime: Mon, 10 Oct 2011 23:24:11 GMT
}

@modeswitch
Member

This is a tough one. Here's my $0.02:

  • We should encode a file system version into the supernode.
  • When we open a file system, check the version in the supernode to see if we can operate on it.
  • If it's out of date, we may need to run an upgrade operation to bring all nodes to the latest version.

For a large file system, that will have performance implications. Maybe we want to make it optional/explicit. If it's optional, we'll have to pack older versions of filer into the module so we can continue to work with older layouts.

Another option is to version the nodes individually and upgrade them as they are read from disk.
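The two options above could be combined: a supernode version check at mount time, plus lazy per-node upgrades on read. The sketch below is an assumption about how that might look; CURRENT_FS_VERSION, checkSuperNode, readNode, and upgradeNode are all invented names, not Filer's API.

```javascript
// Sketch of the versioning idea above: validate the supernode's
// version when opening the filesystem, then upgrade individual
// nodes as they're read. All names are illustrative assumptions.
var CURRENT_FS_VERSION = 2; // assumed current layout version

function checkSuperNode(superNode) {
  var v = superNode.version || 1; // pre-versioning layouts count as v1
  if (v > CURRENT_FS_VERSION) {
    throw new Error('filesystem is newer than this Filer build');
  }
  // Older versions are fine: nodes get upgraded lazily on read.
  return v;
}

function upgradeNode(rawNode) {
  // Placeholder per-node migration; real logic would rewrite fields.
  rawNode.version = CURRENT_FS_VERSION;
  return rawNode;
}

function readNode(rawNode, fsVersion) {
  if (fsVersion < CURRENT_FS_VERSION) {
    return upgradeNode(rawNode);
  }
  return rawNode;
}
```

Lazy upgrading avoids the one-time full-scan cost on large filesystems, at the price of keeping the old-layout reading code around indefinitely.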

@humphd
Contributor Author
humphd commented Dec 3, 2018

@modeswitch agree this is hard. Can you take a look at my bandaid PR in #567? I would like to move forward, and start shipping updated versions, but come back to this to figure out a solution with versioning/upgrading like you describe.

@modeswitch
Member

Yeah, we can have the code part of the discussion in there. Regarding versioning though, some of the work in the next branch is meant to address this.

I want to make a clean split between the implementation version and any API versions. To that end, when client code creates a Filer instance, it would request the type of API (Node, for example) and the version for that API (8, or 10). So it would look something like this: Filer.mount(<some device>, { api: "Node", version: "10" }). BTW, this also allows us to create new APIs that don't match the Node API, in case we want to add cool features.

The implementation version would be stored in the file system, along with any code required to read metadata from older implementations and upgrade that metadata. We should be able to package multiple implementations in the same bundle (according to what downstream developers need for their applications) and direct the client to whatever implementation their file system requires. It would look something like this:

  1. Open the file system and read the supernode.
  2. Check that we have an implementation that matches the version in the supernode. If not, we have to bail out (this would be a downstream developer error).
  3. At this point, the API should let client code check whether a newer implementation is available, so downstream code can decide what to do. We can add all of the code required to do the actual upgrade, and leave it to the downstream code to decide when and how that gets done for any given application.
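The three steps above could be sketched as follows. This is a rough illustration of the proposal, not real Filer code: the implementations registry and the mount function are invented for the example.

```javascript
// Sketch of the proposed mount flow: read the supernode, pick a
// bundled implementation matching its version, and surface whether
// a newer one exists. All names here are illustrative assumptions.
var implementations = {
  1: { version: 1 /* code that reads v1 metadata */ },
  2: { version: 2 /* current implementation */ }
};

function mount(superNode) {
  // 1. Read the supernode (passed in directly for simplicity).
  var fsVersion = superNode.version;

  // 2. Bail out if no bundled implementation matches
  //    (this would be a downstream developer error).
  var impl = implementations[fsVersion];
  if (!impl) {
    throw new Error('no implementation for fs version ' + fsVersion);
  }

  // 3. Let downstream code detect that a newer implementation
  //    exists and decide when/how to run the upgrade.
  var latest = Math.max.apply(null,
    Object.keys(implementations).map(Number));
  return { impl: impl, upgradeAvailable: fsVersion < latest };
}
```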

@humphd
Contributor Author
humphd commented Dec 4, 2018

I need to sit with you in person and really understand your ideas with next. I haven't had time to look at it yet.

The idea around requesting an API version is interesting. I think storing code in the filesystem for reading the filesystem could be hard to get right. I know from working with Service Workers and Cache Storage that it's really easy to get out of sync. I love the idea of doing it, but the devil's in the details.

Packaging multiple implementations in the bundle is unlikely to be popular with the current every-byte-on-the-wire-counts crowd. But I'm interested to explore this. Just because it's not often done, doesn't mean we shouldn't explore it.

For your step 2., another way you could do this is to put a URI/URL into the supernode along with the version (or the URI could be created from the version) and then use that to get the implementation on demand. If a user of an app comes online after a very long time, and has some ancient supernode layout/api needs, you'd need to be able to get the right API on the wire. Hoping you've got it bundled isn't likely to work, since you'd eventually bloat your bundle with edge case API versions.
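Deriving the implementation URL from the version would be trivial; the sketch below shows the shape of the idea, with an entirely made-up URL scheme (example.com is a placeholder, not a real endpoint).

```javascript
// Hypothetical: derive an implementation URL from the version
// stored in the supernode, so the right code can be fetched on
// demand instead of being bundled. The URL scheme is invented.
function implUrlFor(superNode) {
  return 'https://example.com/filer/impl/v' + superNode.version + '.js';
}

// The app could then load it lazily, e.g. via dynamic import or a
// script tag, only when an old-layout filesystem is encountered.
```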

I bet the packaging, versioning, and filesystem layout problems already have good solutions we can look to when we do this. Higher-bandwidth discussions are needed, but I'm keen to do that in the new year.

In the meantime, let's try to get that other PR in shape to land, and then we can at least update our npm module and ship all the goodness I've added this fall.

@modeswitch
Member

Yeah, let's discuss details when we meet in person. Nothing beats a blackboard :)

Packaging multiple implementations in the bundle is unlikely to be popular with the current every-byte-on-the-wire-counts crowd.

This would be on the downstream application designer to decide which versions they want to support. They may decide never to change implementations, in which case this would be unnecessary.

Hoping you've got it bundled isn't likely to work, since you'd eventually bloat your bundle with edge case API versions.

Same as above. Whichever API the application uses is the one that would ultimately get bundled. We would need some work on the UX on our side, but building custom bundles (a la Lodash) isn't impossible.

@humphd
Contributor Author
humphd commented Dec 4, 2018

@modeswitch before I land that mode fix you reviewed, I had a thought. How about if I create some kind of a "migration" test? Help me figure out the smallest possible set of things I need to make this valuable:

  1. We start with a base commit (e.g., current version on npm). We build a filesystem, which could be created via a .zip or something to populate a bunch of stuff in the db.
  2. We perform some set of filesystem operations on those files to simulate a typical workflow. Unlike our unit tests, we don't "zero" the db before each operation, but count on the state being reliable.
  3. We then repeat those workflow steps with the next version we care about, and make sure it can mount and use everything without loss.

We could do this based on tagged versions of Filer, and have it work its way through the various published versions we care about supporting, making sure that the current one we want to ship doesn't make it impossible to read the files.

This sort of upgrade/migration testing must be pretty common, but I haven't done much of it. Do you have any suggestions on how I should approach it?
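The core assertion of such a test could be sketched like this. It assumes a JSON "disk image" of the db captured with an older Filer version; runMigrationTest and the flat store shape are invented for illustration.

```javascript
// A minimal shape for the migration test described above: parse a
// db image captured with an older Filer version and verify that
// everything the old version wrote is still present. In a real
// test we'd mount a Filer instance over the store and re-run the
// workflow; names here are illustrative assumptions.
function runMigrationTest(imageJson, expectedPaths) {
  var store = JSON.parse(imageJson); // db state from the old version
  return expectedPaths.every(function (p) {
    return store[p] !== undefined;
  });
}
```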

@modeswitch
Member

Very interesting question!

I think this is a great place to use the memory provider. We should be able to create a file system and export it (we'll need to add some code to do that) as JSON, encoding binary data in base64. Then we can reload (again, new code needed) those exported file systems into newer Filers and test that they are handled appropriately.

We can get away with this because we aren't testing any of the underlying provider functionality.
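The export/import pair could look something like the sketch below. It assumes the memory provider's backing store is a flat key/value object, which is an assumption about Filer internals, not a documented fact; it uses Node's Buffer for the base64 round-trip.

```javascript
// Sketch: serialize a memory-provider backing store to JSON,
// base64-encoding binary values, and reload it. The flat
// { key: value } store shape is an assumption for illustration.
function exportImage(store) {
  var out = {};
  Object.keys(store).forEach(function (key) {
    var value = store[key];
    out[key] = Buffer.isBuffer(value)
      ? { b64: value.toString('base64') }   // binary data
      : { json: value };                    // plain metadata
  });
  return JSON.stringify(out);
}

function importImage(imageJson) {
  var raw = JSON.parse(imageJson);
  var store = {};
  Object.keys(raw).forEach(function (key) {
    store[key] = raw[key].b64 !== undefined
      ? Buffer.from(raw[key].b64, 'base64')
      : raw[key].json;
  });
  return store;
}
```

Tagging each value on export is what lets the importer distinguish binary blocks from plain metadata objects without guessing.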

@humphd
Contributor Author
humphd commented Dec 4, 2018

OK, let me file a few bugs then. I think we should create a new provider based on memory, but not bloat it for the default bundle (i.e., people can choose to use this other memory+import/export provider by pulling it in manually). As I see it:

  1. create a subclass of the memory provider that can import/export JSON (encode binary data in base64) with provider tests.
  2. create a tool in node that creates a JSON disk image: a node CLI that takes a path to a folder on disk, creates a Filer instance with the provider above, bridges node's fs to read all the data into Filer, then dumps the contents of the provider's backing store to JSON.
  3. create a test harness for loading a JSON filesystem image
  4. create tests for loading different versions of Filer and using the imported filesystem to confirm data integrity.

I'll start filing these. If you think of more, toss them in here.

@modeswitch
Member

The modified memory provider could just live with the test cases. There's no need to bundle it at all.

@humphd
Contributor Author
humphd commented Dec 4, 2018

Filed #603, #604, #605, #606. Let's continue discussion of the testing aspects of this bug in those.

@humphd
Contributor Author
humphd commented Dec 18, 2018

This is fixed. We can iterate on it, but it's safe to upgrade now, and we have tests to back it up.

@humphd humphd closed this as completed Dec 18, 2018