MemSQL Spark Connector 3.0 beta upsert option for ColumnStore table

I just switched to the latest beta version of your Spark Connector 3.0.
Just curious, is there a way I can do an upsert on a columnstore table?


Hi Aleks,

Thanks for reaching out!

Before I answer your question, I want to note that we have published a release candidate (RC) of the connector; it includes a few updates over our beta versions, which you can find in the changelog.
Release Candidate Spark 3.0

To answer your question: the connector has an On Duplicate Key Update option, which lets you specify a rule to apply when updating a row. It also has a merge-on-overwrite setting, which updates all rows that match on a primary key. At present, both of these options work for rowstore tables only: merge requires a primary key, and On Duplicate Key Update requires columnstore support for both a primary key and duplicate key update operations. Documentation on these features is as follows:

Merging on Save
On Duplicate Key Update
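
As a rough sketch of how the two rowstore options are typically set from PySpark: the helper below only builds the save mode and writer options, and a second function applies them to a DataFrame. The option names `overwriteBehavior` and `onDuplicateKeySQL` and the `memsql` format name are taken from the connector README as I recall it, so please check them against the RC documentation. The helper names, the strategy labels, and the example table name are made up for illustration.

```python
from typing import Dict, Optional, Tuple


def memsql_write_settings(
    strategy: str, update_rule: Optional[str] = None
) -> Tuple[str, Dict[str, str]]:
    """Return (save_mode, writer_options) for a rowstore upsert.

    Hypothetical helper: the strategy labels are illustrative, and the
    option names are assumptions to verify against the connector docs.
    """
    if strategy == "merge":
        # Overwrite mode with merge behavior: rows that match on the
        # primary key are updated instead of the table being recreated.
        return "overwrite", {"overwriteBehavior": "merge"}
    if strategy == "on_duplicate_key":
        if not update_rule:
            raise ValueError("on_duplicate_key needs a rule, e.g. 'age = age + 1'")
        # Append mode with a rule applied when a key already exists.
        return "append", {"onDuplicateKeySQL": update_rule}
    raise ValueError(f"unknown strategy: {strategy!r}")


def write_upsert(df, table: str, strategy: str, update_rule: Optional[str] = None) -> None:
    """Write a Spark DataFrame to a rowstore table using the chosen strategy."""
    mode, options = memsql_write_settings(strategy, update_rule)
    df.write.format("memsql").options(**options).mode(mode).save(table)
```

With a live Spark session and the connector on the classpath, a call would look like `write_upsert(df, "mydb.people", "on_duplicate_key", "age = age + 1")`. Both strategies rely on the target table being a rowstore table with a primary key, per the limitation described above.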

That said, in 7.1 (currently in beta) we are releasing unique constraint support on columnstore, and we will be looking into whether the logic we use to merge on a rowstore primary key can be applied to a columnstore unique key instead.

If ‘merge’ won’t work for your use case and you need to update a field based on a given rule through On Duplicate Key Update, our current plan is to support that on columnstore in the next major version of MemSQL, which will be released towards the end of this year. Once that is in place, we can update the connector to support this operation for both rowstore and columnstore.

As a side note, supporting unique constraints and duplicate key update on columnstore is all part of MemSQL’s single store strategy: