
Activity

24 Sep 2022

Issue Comment

Remko

datastore: best practice on schema changes

Is your feature request related to a problem? Please describe.

Sent here after opening a support case with GCP for guidance (Case 25666310)

We are seeking guidance on best practices or patterns we can follow as we roll out datastore schema changes. This is specific to Go and the datastore library provided by this repo. Some other languages, such as Python or Node.js, do not impose strict restrictions on models, so unknown fields are carried along and written back to datastore (e.g. Expando properties), whereas in Go, loading a property that has no corresponding struct field returns an error.

The problem occurs in applications made up of multiple services or backends. For instance, you may have an App Engine API with hundreds of instances and want to slowly roll out a new version to test code changes, including datastore schema changes. Or perhaps the API is implemented purely with Google Cloud Functions. It may be impossible or undesirable to update all services that depend on a given datastore model at once. What ends up happening is that changes introduced in a new version start to break code in old versions.

Describe the solution you'd like

We would like to see some documented best practice or general use case patterns we can follow to ensure smooth migrations in our datastore schema, hopefully sourced from successful and battle-hardened implementations.

This may be aided by new features in this repo. Some ideas (trying to piggyback on the Expando model idea):

  • Introduce support in LoadStruct such that any unknown fields are placed into a []datastore.Property or datastore.PropertyList field found on the struct, identified by a special-case field name or struct tag option.
  • Update LoadStruct to return a MultiError so we can track which fields failed to load and implement something like the above on our own while implementing the PropertyLoadSaver interface.

Describe alternatives you've considered

There are several use cases we can think of and ideas on how they can be implemented. The below are some thoughts on the topic, and although not explicitly discussed/shown below, we do follow a versioning pattern where every single Kind in datastore is given a version, and we keep code to migrate between versions available (implemented in a Load function satisfying the PropertyLoadSaver interface).

For example:

const latestModelVersion = 4

type Model struct {
	Version int `datastore:"v"`
}

// Load reads the stored properties, then migrates older versions up to the latest one.
func (m *Model) Load(ps []datastore.Property) error {
	if err := datastore.LoadStruct(m, ps); err != nil {
		return err
	}
	switch m.Version {
	case latestModelVersion:
		// no-op
	case 1:
		m.migrateV1ToV2()
		fallthrough
	case 2:
		m.migrateV2ToV3()
		fallthrough
	case 3:
		m.migrateV3ToV4()
	default:
		if m.Version > latestModelVersion {
			break // new version being rolled out
		}
		return fmt.Errorf("unexpected version encountered: %d", m.Version)
	}

	return nil
}

// Save always writes the entity at the latest version.
func (m *Model) Save() ([]datastore.Property, error) {
	m.Version = latestModelVersion
	return datastore.SaveStruct(m)
}

Removing a property from the schema

This one seems straightforward:

  1. Roll out a version of your code removing all dependencies on the field you want to remove from the struct/schema
  2. After that code is fully rolled out, push a new version that actively removes the field in a PropertyLoadSaver implementation:
type Model struct {
	FieldToKeep int64
	// FieldToRemove string
}

func (m *Model) Load(ps []datastore.Property) error {
	// Drop the stored property that no longer has a matching struct field.
	for idx := range ps {
		if ps[idx].Name == "FieldToRemove" {
			ps = append(ps[:idx], ps[idx+1:]...)
			break
		}
	}

	return datastore.LoadStruct(m, ps)
}

Adding a property to the schema

A couple of ideas here:

For a 2-phase roll out:

  1. Roll out a version of your code introducing the new field as a pointer and with the omitempty struct tag option; make no use of this field yet
  2. After that code is fully rolled out, push a new version that then uses this field, changing it from a pointer and/or removing the omitempty struct tag option if necessary
type Model struct {
	FieldToKeep int64
	// FieldToRemove string
	FieldToIntroduce *time.Time `datastore:",omitempty"`
}

For a single version roll out (preferred): Ensure you follow a pattern where your models implement some sort of "catch-all" feature, such that any unknown fields from datastore.LoadStruct are carried with the entity:

type Model struct {
	CatchAll []datastore.Property `datastore:"*,catchall"`
}

func (m *Model) Load(ps []datastore.Property) error {
	// Option 1: if `LoadStruct` is updated to support the `catchall` property:
	if err := datastore.LoadStruct(m, ps); err != nil {
		return err
	}

	// Option 2: if `LoadStruct` is updated to return a MultiError whose non-nil entries
	// correspond to the properties in ps that couldn't be loaded:
	if err := datastore.LoadStruct(m, ps); err != nil {
		if merr, ok := err.(datastore.MultiError); ok { // Maybe a MultiMismatchError
			// Only `no such struct field` errors are moved to `CatchAll`, which
			// doesn't necessarily need to be exported in this scenario.
			m.CatchAll, err = getMissingProperties(ps, merr)
		}
		if err != nil {
			return err
		}
	}

	return nil
}

func (m *Model) Save() ([]datastore.Property, error) {
	ps, err := datastore.SaveStruct(m)
	return append(ps, m.CatchAll...), err
}

Although this doesn't require a 2-phased rollout, updates to entities must always follow a Read -> Mutate Original Entity -> Write cycle to ensure that the CatchAll list remains intact.
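To make the catch-all idea concrete without depending on any change to the library, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: `Property` is a local stand-in for datastore.Property, and `splitKnown` is a hypothetical helper, not part of the datastore package. It only shows how unknown properties could be set aside on load and appended again on save.

```go
package main

import "fmt"

// Property is a local stand-in for datastore.Property (assumption: only
// Name and Value matter for this sketch).
type Property struct {
	Name  string
	Value interface{}
}

// splitKnown is a hypothetical helper: it partitions ps into properties
// whose names are in known and all other properties, preserving order.
func splitKnown(ps []Property, known map[string]bool) (matched, unknown []Property) {
	for _, p := range ps {
		if known[p.Name] {
			matched = append(matched, p)
		} else {
			unknown = append(unknown, p)
		}
	}
	return matched, unknown
}

// Model carries unrecognized properties alongside its typed field.
type Model struct {
	FieldToKeep int64
	CatchAll    []Property
}

// Load fills the typed field and keeps everything else in CatchAll.
func (m *Model) Load(ps []Property) error {
	matched, unknown := splitKnown(ps, map[string]bool{"FieldToKeep": true})
	m.CatchAll = unknown
	for _, p := range matched {
		v, ok := p.Value.(int64)
		if !ok {
			return fmt.Errorf("property %q: unexpected type %T", p.Name, p.Value)
		}
		m.FieldToKeep = v
	}
	return nil
}

// Save writes the typed field followed by the carried-over properties.
func (m *Model) Save() ([]Property, error) {
	ps := []Property{{Name: "FieldToKeep", Value: m.FieldToKeep}}
	return append(ps, m.CatchAll...), nil
}

func main() {
	stored := []Property{
		{Name: "FieldToKeep", Value: int64(1)},
		{Name: "FieldFromNewerVersion", Value: "hello"},
	}
	var m Model
	if err := m.Load(stored); err != nil {
		panic(err)
	}
	m.FieldToKeep = 2 // mutate the loaded entity, then write it back
	out, _ := m.Save()
	for _, p := range out {
		fmt.Println(p.Name, p.Value)
	}
}
```

The property written by the newer version survives the round trip only because the same loaded entity is saved again, which is the Read -> Mutate -> Write requirement described above.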

Modifying an existing property on the schema

This seems like a combination of introducing a new field and then removing the old one, but is there an easier/more efficient way to do this?

Additional context

The above hasn't really been tested and requires some overhead (changes done over 2 version rollouts). Are there known patterns/ways to improve upon the above?

Forked On 24 Sep 2022 at 07:48:21

Remko

@telpirion I'm not convinced this issue can be closed already. If the claim is that proper documentation is enough, then I'd like to see the documentation first, because I don't yet see what it would say.

I currently have to write a lot of code and use code generators to make sure properties are not lost on e.g. rollbacks of the introduction of new properties. Other datastore SDKs don't suffer from this problem AFAIK (certainly not NDB, but I don't think the Python Datastore libraries do either, as they just store all properties dynamically on the model). A little bit of extra help in the Go datastore library to assist in storing unrecognized properties in an opaque way so they are written back on subsequent writes would be extremely useful here.

Commented On 24 Sep 2022 at 07:48:21

Remko

started

Started On 23 Sep 2022 at 05:19:14
Pull Request

Remko

Add BookWidgets to tester list

Created On 14 Sep 2022 at 12:43:06

Remko

Add BookWidgets to tester list

Pushed On 14 Sep 2022 at 12:41:52

Remko

A proposal for a cookie attribute to partition cross-site cookies by top-level site

Forked On 14 Sep 2022 at 12:39:06
Started

Remko

started

Started On 21 Aug 2022 at 06:06:42

Remko

WIP

Pushed On 20 Aug 2022 at 09:57:26
Create Branch
Remko In remko/waforth Create Branch waforthc

Remko

Bootstrapping dynamic Forth Interpreter/Compiler for WebAssembly

On 20 Aug 2022 at 09:32:40
Create Branch
Remko In remko/go-sni Create Branch v0.1.0

Remko

Go implementation of Freedesktop.org StatusNotifierItem specification

On 20 Aug 2022 at 08:24:52

Remko

Initial commit

Pushed On 20 Aug 2022 at 08:19:38
Create Branch
Remko In remko/go-sni Create Branch

Remko

Go implementation of Freedesktop.org StatusNotifierItem specification

On 20 Aug 2022 at 07:44:05
Create Branch
Remko In remko/go-sni Create Branch main

Remko

Go implementation of Freedesktop.org StatusNotifierItem specification

On 20 Aug 2022 at 07:44:05
Issue Comment

Remko

Implementing other languages with the same approach

First of all, major kudos and thank you for putting this together, I've been fascinated with this project since I found it, after finally diving a bit deeper into Forth. To be honest I can't get it out of my head for a few weeks now, I'm just filled with ideas for how I can use this kind of approach for other projects.

Is there any other Wasm language/runtime using the same kind of function dynamic linking? i.e. compiling words/functions to Wasm functions, loading them into a function ref table, and immediately having them available for further use in the language? I'm thinking this might be a great way to go for implementing other languages using the same kind of Forth-like hybrid interpreter-compiler runtime, such as Rebol/Red, or even other more dynamic languages based on a JIT compiler, have you by chance given this any thought @remko?

Forked On 19 Aug 2022 at 10:49:11

Remko

I didn't experiment with a C implementation. I don't think there's a good technical reason to use WebAssembly directly. I just wanted to see how far I could take the minimalism, and it's a good WebAssembly learning opportunity 😉 Forth's simplicity lends itself to such an approach, but for any higher-level language, I'd probably write the system in a high-level language as well. Any language that compiles to WebAssembly would do. C would be fine, but Zig might be more fun and more interesting.

Commented On 19 Aug 2022 at 10:49:11
Started

Remko

started

Started On 19 Aug 2022 at 09:34:48
Issue Comment

Remko

TOS caching

Hi,

I was quite surprised to see such a difference with gforth. My intuition is that it is due to the lack of TOS caching. Have you considered that approach?

Forked On 19 Aug 2022 at 09:00:30

Remko

I believe that's more or less what happens today (not at the time of the benchmarks).

For example, this is the implementation of the DROP word:

(func $DROP (param $tos i32) (result i32)
    (i32.sub (local.get $tos) (i32.const 4))) 

The TOS is passed as a parameter, and the new TOS is returned as a return value. All words have the same signature.

A word defined in terms of other words looks like this:

(func $2@  (param $tos i32) (result i32)
    (local.get $tos)
    (call $DUP)
    (call $CELL+)
    (call $@)
    (call $SWAP)
    (call $@)) 

Because WebAssembly's computational model is built around an implicit operand stack, the TOS can be implicitly threaded through all calls.
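The same calling convention can be modeled outside WebAssembly. The following is a rough Go sketch, not WAForth code: the `mem` array stands in for linear memory, and the stack layout (indexed in cells, with `tos` pointing at the first free cell and growing upward) is an assumption made for illustration. Every word takes the stack pointer and returns the new one, so a word built from other words just passes the value along.

```go
package main

import "fmt"

// mem is a stand-in for WebAssembly linear memory, indexed in cells.
var mem [64]int32

// word is the single signature shared by all words:
// take the stack pointer, return the new stack pointer.
type word func(tos int) int

// push returns a word that pushes the literal v.
// Assumption: tos points at the first free cell and the stack grows upward.
func push(v int32) word {
	return func(tos int) int {
		mem[tos] = v
		return tos + 1
	}
}

// drop discards the top cell, like the DROP word shown above.
func drop(tos int) int { return tos - 1 }

// dup duplicates the top cell.
func dup(tos int) int {
	mem[tos] = mem[tos-1]
	return tos + 1
}

// add replaces the top two cells with their sum.
func add(tos int) int {
	mem[tos-2] += mem[tos-1]
	return tos - 1
}

// double is a word defined in terms of other words (DUP +).
// The stack pointer is threaded through each call and returned.
func double(tos int) int {
	tos = dup(tos)
	tos = add(tos)
	return tos
}

func main() {
	tos := 0
	tos = push(21)(tos)
	tos = double(tos)
	fmt.Println(mem[tos-1])
	tos = drop(tos)
	fmt.Println(tos)
}
```

In Go the threading has to be written out as explicit assignments; in the WebAssembly text above it falls out of the operand stack, since each call consumes the value the previous one left behind.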

Commented On 19 Aug 2022 at 09:00:30
Issue Comment

Remko

TOS caching

Hi,

I was quite surprised to see such a difference with gforth. My intuition is that it is due to the lack of TOS caching. Have you considered that approach?

Forked On 19 Aug 2022 at 07:23:34

Remko

Yeah, those benchmarks are outdated.

In WebAssembly, you don't have any control over registers.

As mentioned in the Design document, subroutine threading is the only alternative in WebAssembly. Direct/indirect threading are not possible due to the design of WebAssembly. You could implement a static dispatching to built-in words using something like br_table, but even if that would end up being more efficient (which would depend on the JIT compiler), it would still not work with compiled words, and you'd end up with 2 execution mechanisms (which is too much complexity for me).
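To illustrate why this would mean two execution mechanisms, here is a loose Go analogy, not WAForth code. The names `callBuiltin`, `define`, and `callIndirect` are hypothetical, and the stack layout (`tos` pointing at the first free cell) is assumed. A `switch` over a fixed opcode set plays the role of br_table-style static dispatch, while a growable slice of functions plays the role of the table used for indirect calls to words compiled at run time.

```go
package main

import "fmt"

// word is the shared signature: stack pointer in, stack pointer out.
type word func(tos int) int

// stack is a stand-in for the data stack (assumption: tos is the first free cell).
var stack [16]int32

// Mechanism 1: static dispatch over a fixed set of built-ins,
// loosely analogous to a br_table over known opcodes.
const (
	opDup = iota
	opAdd
)

func callBuiltin(op, tos int) int {
	switch op {
	case opDup:
		stack[tos] = stack[tos-1]
		return tos + 1
	case opAdd:
		stack[tos-2] += stack[tos-1]
		return tos - 1
	}
	panic("unknown built-in")
}

// Mechanism 2: a table that grows as words are compiled at run time,
// loosely analogous to a function table used for indirect calls.
var table []word

// define registers a newly compiled word and returns its table index.
func define(w word) int {
	table = append(table, w)
	return len(table) - 1
}

// callIndirect calls whatever word is stored at idx.
func callIndirect(idx, tos int) int { return table[idx](tos) }

func main() {
	// A word defined at run time cannot be added to the switch above,
	// so it has to go through the table instead.
	double := define(func(tos int) int {
		return callBuiltin(opAdd, callBuiltin(opDup, tos))
	})
	stack[0] = 21
	tos := callIndirect(double, 1)
	fmt.Println(stack[tos-1])
}
```

The fixed `switch` cannot learn about `double`, so the run-time-defined word still needs the table, and both paths have to be maintained side by side.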

Commented On 19 Aug 2022 at 07:23:34
Issue Comment

Remko

TOS caching

Hi,

I was quite surprised to see such a difference with gforth. My intuition is that it is due to the lack of TOS caching. Have you considered that approach?

Forked On 19 Aug 2022 at 06:42:30

Remko

Hi @jacereda

Where did you see the difference? Did you do your own measurements, or are you referring to the toy tests I once did? If the latter, I wouldn't trust any of those results:

  • A lot has changed in browser WebAssembly implementations since then, both good (speedups) and bad (Spectre mitigations). I have no idea in which direction this would go.
  • A lot has changed in WAForth itself in terms of design (see below)
  • For Gforth times, I did not look at any flags. For all I know, I could have used a very inefficient setup
  • Any benchmarking I did much later than those did not correspond to what I saw back then. No idea what changed, maybe there was a bug in my benchmark setup.

That said, my gut feeling would say that Gforth should be faster. The subroutine threading using indirect calls used in WAForth is probably the biggest overhead, although it could be that Gforth has more control over other things such as where to keep the TOS as well.

I don't know what you mean concretely by TOS caching. At the time of those benchmarks, the TOS was stored as an (unexported) mutable global variable. This means that changing the TOS from within a compiled word needed a call into the main module, which is obviously expensive. Switching to exported mutable globals (so the global TOS var could be manipulated from within a compiled word) didn't help much, and caused some significant slowdowns in some browsers even (maybe some protection mechanism). I changed the design to replace the TOS global variable by a first parameter being threaded through every call, and returned by every call. This also allowed me to inline any pushes and pops, which gave a total speedup of 43% by my very simple benchmark.

I haven't had the time to figure out how to look at the generated machine code by the WASM engines. Ideally, the TOS would be in a register most of the time, but I would be surprised if this was the case. I think that by threading the TOS as a first parameter, I put it as close to the execution path as possible without having any control over where variables are located, but I can only guess.

I'm planning to have an optional (external) postprocessor that combines all WAForth-compiled WASM modules into one big module, and replaces indirect calls with direct calls. This would get rid of all the overhead due to indirect calls, but it will probably still not be as efficient as direct threading.

Commented On 19 Aug 2022 at 06:42:30
Issue Comment

Remko

Implementing other languages with the same approach

First of all, major kudos and thank you for putting this together, I've been fascinated with this project since I found it, after finally diving a bit deeper into Forth. To be honest I can't get it out of my head for a few weeks now, I'm just filled with ideas for how I can use this kind of approach for other projects.

Is there any other Wasm language/runtime using the same kind of function dynamic linking? i.e. compiling words/functions to Wasm functions, loading them into a function ref table, and immediately having them available for further use in the language? I'm thinking this might be a great way to go for implementing other languages using the same kind of Forth-like hybrid interpreter-compiler runtime, such as Rebol/Red, or even other more dynamic languages based on a JIT compiler, have you by chance given this any thought @remko?

Forked On 18 Aug 2022 at 06:14:36

Remko

@jpaquim here is a link that will probably interest you. It talks about the same techniques used in WAForth, and comes with Scheme examples.

Commented On 18 Aug 2022 at 06:14:36
Issue Comment

Remko

Changing sort with a namespace selected loses the namespace

When I select a namespace, I can see all the data in that namespace. When I then click on the table header to change the order, the namespace resets back to the default namespace.

The /namespaces/namespaceID/ part in the URL seems to be missing. When I manually add this namespace part to the URL, I see the expected data in the expected order.

I just noticed: the same issue seems to occur when clicking on the next/previous buttons or selecting a different number of rows per page.

Thank you for creating this great tool. This really is the best tool I found.

Forked On 18 Aug 2022 at 10:45:33

Remko

Good catch, thanks!

Should be fixed in v0.18.1.

Commented On 18 Aug 2022 at 10:45:33

Remko

v0.18.1

Pushed On 18 Aug 2022 at 10:41:09
Create Branch
Remko In remko/dsadmin Create Branch v0.18.1

Remko

Google Cloud Datastore Emulator Administration UI

On 18 Aug 2022 at 10:41:08

Remko

Fix namespace switch when paging or sorting

Fixes #5

Pushed On 18 Aug 2022 at 10:40:15
Issue Comment

Remko

Implementing other languages with the same approach

First of all, major kudos and thank you for putting this together, I've been fascinated with this project since I found it, after finally diving a bit deeper into Forth. To be honest I can't get it out of my head for a few weeks now, I'm just filled with ideas for how I can use this kind of approach for other projects.

Is there any other Wasm language/runtime using the same kind of function dynamic linking? i.e. compiling words/functions to Wasm functions, loading them into a function ref table, and immediately having them available for further use in the language? I'm thinking this might be a great way to go for implementing other languages using the same kind of Forth-like hybrid interpreter-compiler runtime, such as Rebol/Red, or even other more dynamic languages based on a JIT compiler, have you by chance given this any thought @remko?

Forked On 16 Aug 2022 at 08:19:41

Remko

@jpaquim Thanks! Very nice to hear this work has inspired you!

I don't know if other projects take this approach, but I must admit I'm not following everything that's going on in the WASM scene. WAForth is pretty hard-core because the compiler is implemented in WebAssembly as well; it's probably easier to use a high-level language to implement a MyLanguage-to-WASM compiler (or use an existing one, possibly cross-compiled to WebAssembly itself). I'm pretty sure other WASM compilers support dynamic loading too, since there are proposals to improve things such as position-independent code. I don't know if you can do interactive-style development with any of these, though.

Commented On 16 Aug 2022 at 08:19:41

Remko

started

Started On 14 Aug 2022 at 11:00:39
Started

Remko

started

Started On 10 Aug 2022 at 09:01:32

Remko

started

Started On 31 Jul 2022 at 08:04:31

Remko

Support importing memory & table

Does this project have any plans to support importing memories and tables when instantiating a WebAssembly Instance? Also, are there any plans to implement the grow method of memories and tables?

Forked On 28 Jul 2022 at 05:20:31

Remko

@lum1n0us In my case, my memory and table aren’t host-made. They’re simply imported from another module.

Commented On 28 Jul 2022 at 05:20:31
Issue Comment

Remko

Bulk memory problem with wat2wasm + wasm2c

I have a WASM program written in text format that uses bulk memory operations. When using wat2wasm, and then passing that output to wasm2c, I get the following error:

error: unexpected opcode: 0xfc 0xa

I'm using WABT 1.0.29 on macOS.

Forked On 13 Jul 2022 at 06:45:33

Remko

@keithw Thanks. That branch works for me. Looking forward to seeing it merged.

Commented On 13 Jul 2022 at 06:45:33