The Socrata Publisher API allows you to create, update, and delete rows in a single operation, using their row identifiers. This is an excellent way to keep your Socrata dataset in sync with an internal system.
Please note that all operations that modify datasets must be authenticated as a user who has access to modify that dataset, and must be accompanied by an application token.
Creating Your Upsert Payload
The dataset for this example is the USGS Earthquakes Sample Dataset, which has a publisher-specified row identifier configured.
We’ll format our upsert payload as a JSON array of objects. In this payload, we’ll be:
- Creating one new earthquake,
- Updating a second,
- And deleting a third.
This example contains only a few records, but upsert operations can easily update thousands of records at a time.
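As a minimal sketch of what such a payload can look like, the Python below builds the three objects and serializes them as a JSON array. The column names (`earthquake_id`, `magnitude`, `region`) and the ID values are hypothetical placeholders rather than the sample dataset's actual schema, and the comments assume which rows already exist in the dataset.

```python
import json

# Hypothetical row identifier column and ID values, for illustration only.
ROW_ID = "earthquake_id"

payload = [
    # Assumed not to exist in the dataset yet, so this object creates a row.
    {ROW_ID: "new0001", "magnitude": 1.2, "region": "Northern California"},
    # Assumed to exist already, so this object updates the matching row.
    {ROW_ID: "old0002", "magnitude": 2.7, "region": "Central Alaska"},
    # A delete needs only the identifier and the special ":deleted" key.
    {ROW_ID: "old0003", ":deleted": True},
]

# Serialize the list of objects as a JSON array, ready to be sent.
body = json.dumps(payload, indent=2)
print(body)
```

Building the payload as ordinary dictionaries and serializing once at the end keeps create, update, and delete objects in a single list, which is all the upsert format requires.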
Note a few things about this payload:
- The only difference between the create object and the update object is that no row exists yet with the first object's identifier, while one does for the second. The server automatically determines whether a record already exists and, if so, updates it rather than creating a duplicate.
- For the row we’re deleting, we’ve included the special :deleted key with a value of true. This tells the server to delete the record matching that ID, so we don’t need to include the rest of the object.
Taken together, these properties make this upsert operation entirely idempotent. The operation can be retried any number of times, and the state of the dataset will be the same after each attempt. This makes upsert requests very safe to retry if something goes wrong.
Performing Your Upsert
Once you’ve constructed your payload, upserting it is as simple as POSTing it to your dataset’s endpoint, along with the appropriate authentication and application token information:
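One way such a request might be assembled, using only the Python standard library, is sketched here. It rests on several assumptions: that the dataset endpoint follows the `https://<domain>/resource/<dataset id>.json` pattern, that the application token is sent in an `X-App-Token` header, and that HTTP Basic authentication is used. The domain, dataset ID, and credentials are placeholders, and `build_upsert_request` is a hypothetical helper name. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_upsert_request(domain, dataset_id, payload, app_token, username, password):
    """Build (but do not send) a POST request carrying a JSON upsert payload."""
    # Assumed endpoint pattern; check your own dataset's API endpoint.
    url = f"https://{domain}/resource/{dataset_id}.json"
    # HTTP Basic credentials are "username:password", base64-encoded.
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-App-Token": app_token,
            "Authorization": f"Basic {credentials}",
        },
    )


request = build_upsert_request(
    "data.example.com",          # placeholder domain
    "abcd-1234",                 # placeholder dataset ID
    [{"earthquake_id": "old0003", ":deleted": True}],
    app_token="YOUR_APP_TOKEN",
    username="user@example.com",
    password="secret",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
print(request.get_method(), request.full_url)
```

Separating construction from sending makes it easy to inspect the headers and body before anything touches a live dataset.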
You’ll get back a response detailing what went right or wrong:
That means we created one record, updated a second, and deleted a third. If we were to retry that operation, we’d get the following:
Since our record was already created, our “create” became an “update”. And because our third row was already deleted, there was nothing to delete with that ID.
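The retry behavior can be illustrated with a small in-memory model of upsert semantics. This is a sketch for reasoning about idempotency, not the server's actual implementation; `apply_upsert`, the column names, and the ID values are all hypothetical.

```python
def apply_upsert(rows, payload, row_id="earthquake_id"):
    """Apply an upsert payload to an in-memory table keyed by row identifier.

    `rows` maps identifier -> row dict and is modified in place.
    Returns counts of created, updated, and deleted rows.
    """
    counts = {"created": 0, "updated": 0, "deleted": 0}
    for obj in payload:
        key = obj[row_id]
        if obj.get(":deleted"):
            # Deleting a row that is already gone is a no-op.
            if rows.pop(key, None) is not None:
                counts["deleted"] += 1
        elif key in rows:
            rows[key].update(obj)
            counts["updated"] += 1
        else:
            rows[key] = dict(obj)
            counts["created"] += 1
    return counts


table = {
    "old0002": {"earthquake_id": "old0002", "magnitude": 2.5},
    "old0003": {"earthquake_id": "old0003", "magnitude": 3.1},
}
payload = [
    {"earthquake_id": "new0001", "magnitude": 1.2},
    {"earthquake_id": "old0002", "magnitude": 2.7},
    {"earthquake_id": "old0003", ":deleted": True},
]

first = apply_upsert(table, payload)
state_after_first = {k: dict(v) for k, v in table.items()}
second = apply_upsert(table, payload)

print(first)   # {'created': 1, 'updated': 1, 'deleted': 1}
print(second)  # {'created': 0, 'updated': 2, 'deleted': 0}
print(table == state_after_first)  # True
```

The counts differ between the two runs, but the resulting table does not, which is the sense in which the operation is idempotent.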
Upserting with CSV
You can also use properly formatted “Comma Separated Value” (CSV) data to do updates, just as you can with JSON. Just make sure you follow a few rules:
- Your data should be compliant with the IETF RFC 4180 CSV specification. That means:
  - Fields are separated by commas and records are separated by newlines
  - Fields can be optionally wrapped in double quotes (")
  - You can embed a newline within a field by wrapping the field in quotes. Newlines will not be treated as record separators until the field is terminated by the closing double quote
  - If a double quote occurs within a quoted field, the quote can be escaped by doubling it (i.e., " would become "")
- The first line in your file must be a “header row” that contains the API field names for each of the fields in your data file. That header will be used to determine the order of the fields in the records below
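Rather than applying these quoting rules by hand, a CSV library can apply them for you. The sketch below uses Python's standard `csv` module, whose default dialect separates fields with commas, quotes fields only when needed, and escapes embedded double quotes by doubling them. The field names and values are hypothetical; in practice the header must use your dataset's actual API field names.

```python
import csv
import io

# Hypothetical rows; one value contains double quotes, another a comma and a newline.
rows = [
    {"earthquake_id": "new0001", "magnitude": "1.2", "region": 'Near "The Geysers"'},
    {"earthquake_id": "old0002", "magnitude": "2.7", "region": "Central Alaska,\nsecond line"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["earthquake_id", "magnitude", "region"])
writer.writeheader()    # header row sets the field order for the records below
writer.writerows(rows)  # quoting and quote-doubling are handled by the writer
csv_text = buffer.getvalue()
print(csv_text)
```

Reading `csv_text` back with `csv.DictReader` recovers the original values, which is a quick way to check that a file is well-formed before sending it.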
Here’s an example:
Just like before, upserting it is as simple as POSTing it to your dataset’s endpoint, along with the appropriate authentication and application token information. Make sure you use a content type of text/csv:
You’ll get back a response like you did in the previous example, detailing what went right and wrong:
Finally, note that appending and upserting geographic information to be geo-coded by Socrata requires writing a string into a single location column. Please refer to our Support Portal documentation for specific information on formatting this string:
Location Information Which Can Be Geo-coded
Importing, Data Types, and You