REST Spec: Add unregister table endpoint #16400
nssalian left a comment:
Left a comment, but I think the branch needs a rebase.
Co-Authored-By: Claude Code (Opus 4.7, 1M context) <[email protected]>
c1bf5d7 to 86349c9
Unregister a table from the catalog. This is the opposite of
`registerTable`. The table no longer exists in the catalog, but the
underlying data and metadata files are left in place so that the table
can be registered in another catalog.
Can you also add a comment explaining the difference between this and the delete request with purgeRequested set to false? And maybe also update the description of delete, especially regarding the purgeRequested semantics.
This seems unnecessary to me. The DELETE request is called dropTable and the semantics of dropping a table are clear in most SQL systems and are no different in this spec. The purge flag is also already correct: it signals whether the caller requested a purge; how to interpret that is not dictated by the spec.
In addition, this description already calls out the relevant parts: that data and metadata are left in place so they can be registered in another catalog.
I think this is another example of "less is more" in our specs. If we were to call out that "unlike drop" this doesn't delete data and metadata, we would make the requirements more confusing, because we don't specify when data should be deleted (if at all) for DROP. It may seem like a good idea to point out how this differs from drop, but I think the specification is clearer if this focuses on the correct behavior of this endpoint and lets the drop documentation focus on its correct behavior.
> the semantics of dropping a table are clear in most SQL systems

True for traditional databases, but not for data lake engines. Neither AWS Redshift nor AWS Athena purges table data when running DROP TABLE. Apache Spark also says that, for an external table, DROP TABLE is merely a metadata operation.
I am afraid that if we don't spell out the semantics of purgeRequested clearly, people might suffer catastrophic data loss when they misinterpret purgeRequested = false.
An interesting experiment using Claude and the latest mainline. If an LLM is confused, I bet many humans, and humans who use LLMs, would be confused too.
Read rest-catalog-open-api.yaml carefully. Now if I want to implement 'DROP TABLE' for a query engine, which only removes the table entry from catalog, but not the data and metadata on cloud storage, which IRC interface should I use, what parameters should be set?
● You should use the DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table} endpoint (operation: dropTable).
To only remove the catalog entry without purging data and metadata on storage, set the query parameter:
purgeRequested=false
This is also the default (default: false), so you can simply omit the parameter entirely. The request would look like:
DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table}
A successful response returns 204 No Content.
The key distinction: when purgeRequested=true, the catalog is asked to also delete the underlying data and metadata files. When false (or omitted), only the catalog entry is removed; the files on cloud storage remain intact.
This adds an endpoint to `unregister` a table from a REST catalog without deleting data or metadata files. A `register` endpoint already exists to add a table to a REST catalog, since most migrations have been to REST from Hive or other catalogs. But an `unregister` endpoint is needed to safely migrate from one REST catalog to another through the API.

This uses an empty POST to `unregister` under a table resource (`/v1/{prefix}/namespaces/{namespace}/tables/{table}/unregister`), rather than alongside the `register` endpoint, which would require an unregister-specific request.

The request and endpoint structure were co-authored by Claude Code (Opus 4.7).
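
For illustration, the two requests discussed in this thread can be sketched side by side. This is a minimal sketch, not part of the spec or of any client library: the function names, the `catalog.example.com` host, and the single-level namespace are assumptions, and the requests are only built, never sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def _table_path(prefix: str, namespace: str, table: str) -> str:
    # Assumes a single-level namespace; multi-level namespaces need the
    # separator encoding defined by the REST spec, which is not handled here.
    parts = (quote(p, safe="") for p in (prefix, namespace, table))
    return "/v1/{}/namespaces/{}/tables/{}".format(*parts)


def unregister_table_request(base_url: str, prefix: str,
                             namespace: str, table: str) -> Request:
    """Build the empty POST to the unregister sub-resource proposed in this PR."""
    url = base_url.rstrip("/") + _table_path(prefix, namespace, table) + "/unregister"
    # Empty body: the table is identified entirely by the path.
    return Request(url, data=b"", method="POST")


def drop_table_request(base_url: str, prefix: str, namespace: str,
                       table: str, purge_requested: bool = False) -> Request:
    """Build the existing dropTable DELETE with an explicit purgeRequested flag."""
    query = urlencode({"purgeRequested": str(purge_requested).lower()})
    url = base_url.rstrip("/") + _table_path(prefix, namespace, table) + "?" + query
    return Request(url, method="DELETE")


if __name__ == "__main__":
    # Hypothetical catalog host and table identifiers, for illustration only.
    for req in (
        unregister_table_request("https://catalog.example.com", "main", "db", "events"),
        drop_table_request("https://catalog.example.com", "main", "db", "events"),
    ):
        print(req.get_method(), req.full_url)
```

The unregister request carries no flag to interpret, while the drop request carries a `purgeRequested` hint, which is the distinction the review thread debates.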