Query JSON Data in Object Storage using External Tables
With Autonomous AI Database, you can access JSON documents stored in Oracle Cloud Infrastructure Object Storage using external tables.
Depending on the database version, you can run queries on JSON documents in different formats.
-
Query JSON documents as JSON objects with Oracle AI Database 26ai
-
Query JSON documents as textual data with Oracle Database 19c or earlier
Autonomous AI Database processes the following general JSON document representation in external files:
-
Multiple single line JSON documents within files
-
Single multiline JSON document per file
Here are some examples explaining these in detail.
- Examples: Querying JSON Documents as JSON Objects
Autonomous AI Database allows running queries on JSON documents as JSON objects. - Examples: Querying JSON Documents as Textual Data
Autonomous AI Database allows running queries on JSON documents as textual data.
Parent topic: Query External Data with Autonomous AI Database
Examples: Querying JSON Documents as JSON Objects
Autonomous AI Database allows running queries on JSON documents as JSON objects.
You can query various types of JSON documents as objects, such as:
-
Line-delimited JSON documents
-
JSON arrays
-
Object wrapped JSON arrays
-
Extended JSON (EJSON) Support
-
Single JSON document with multiline files
The following examples demonstrate how to access JSON documents as objects
with Oracle AI Database 26ai using the format option
jsondoc.
Example 1: Query Line-delimited JSON Documents as JSON Objects
This example shows how to query a JSON file containing multiple line-delimited JSON documents in object storage.
A JSON file, for example, fruitLineDelimited.json
exists in the object storage that has three lines with one object per line.
{ "name": "apple", "count": 20 }
{ "name": "orange", "count": 42 }
{ "name": "pear", "count": 10 }
-
Create an external table in your database using the JSON file.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_1', credential_name =>'DEF_CRED_NAME', format => json_object('type' value 'jsondoc'), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitLineDelimited.json'; END; /The parameters are:
-
table_name: is the external table name. -
credential_name: is the name of the credential created in the previous step. Thecredential_nameparameter must conform to Oracle object naming conventions. See Database Object Naming Rules for more information. -
format: defines the options you can specify to describe the format of the source file. -
file_uri_list: is a comma delimited list of the source files you want to query.
-
-
Query the external table using the
SELECTstatement.SELECT * FROM fruit_1;
Example 2: Query JSON Arrays as JSON Objects
This example shows how to query a single array of JSON objects. Arrays are unpacked by default.
A JSON file, for example, fruitArray.json exists in the
object storage and has the following data.
[
{
"name" : "apple",
"count": 20
},
{
"name" : "orange",
"count": 42
},
{
"name" : "pear",
"count": 10
}
]
-
Create an external table in your database using the JSON file.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_2a', credential_name =>'DEF_CRED_NAME', format => json_object('type' value 'jsondoc'), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitArray.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_2a; -
You can use the
jsonpathoption to preserve the array structure and return an array of JSON object.BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_2b', credential_name =>'DEF_CRED_NAME', format => json_object('type' value 'jsondoc', 'jsonpath' value '$'), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitArray.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_2b;
Example 3: Query Object Wrapped JSON Arrays as JSON Objects
This example shows how to query a JSON file wrapped in an outer JSON
document. In this example, you provide a path using the format option
jsonpath to the data that you want to load. The path must lead
to an array. The rows are mapped as in the previous example.
A JSON file, for example, fruitEmbeddedArray.json
exists in the object storage and has the following data:
{
"last_updated": 1434054678,
"ttl": 0,
"version": "1.0",
"fruit": [
{ "name" : "apple", "count": 20 },
{ "name" : "orange", "count": 42 },
{ "name" : "pear", "count": 10 }
]
}
-
Create an external table in your database using the JSON file.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_3', credential_name =>'DEF_CRED_NAME', format => json_object('type' value 'jsondoc', 'jsonpath' value '$.fruit[*]'), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitEmbeddedArray.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_3;
Example 4: Query Extended JSON (EJSON) Documents as JSON Objects
This example shows how to query an EJSON file. The SQL type JSON can represent extended JSON types such as TIMESTAMP, DOUBLE, FLOAT, and RAW. The JSON text can represent extended JSON types by using the extended JSON format. These EJSON annotations are automatically converted to the corresponding types.
An EJSON file, for example, fruitEjson.json exists in
the object storage and has the following data:
{ "name" : "apple", "count": 20, "modified":{"$date":"2020-06-29T11:53:05.439Z"} }
{ "name" : "orange", "count": 42 }
{ "name" : "pear", "count": 10 }
-
Create an external table in your database using the JSON file.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_4', credential_name =>'DEF_CRED_NAME', format => json_object('type' value 'jsondoc'), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitEjson.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_4;
Example 5: Query a Single JSON Document with Multiline Files as a JSON Object
This example shows how to query a single JSON document with multiline files.
-
A single JSON document with multiline files can be mapped to a table, where each JSON file in the directory is mapped to a single row.
-
A single JSON document with multiline files can be a directory containing JSON files where each JSON file is mapped to a single row in the table.
This example uses JSON files containing the following data:
fruitMultiLineFile1.json
{
"name" : "apple",
"count": 42
}
fruitMultiLineFile2.json
{
"name" : "orange",
"count": 5
}
fruitMultiLineFile3.json
{
"name" : "pear",
"count": 10
}
-
Create an external table in your database using these JSON files.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_5', credential_name =>'DEF_CRED_NAME', format => json_object('type' value 'jsondoc'), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitMultiLine*.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_5;
Parent topic: Query JSON Data in Object Storage using External Tables
Examples: Querying JSON Documents as Textual Data
Autonomous AI Database allows running queries on JSON documents as textual data.
Oracle Database 19c does not support a native JSON
data type. So, an external data in JSON data representation is handled as textual data.
Consequently, JSON data in external objects is processed using the
VARCHAR2, CLOB, or BLOB data
types.
Points to Consider While Accessing JSON Documents
-
Define the field
delimiteroption to a value or character that is not in your JSON data. In other words, the field delimiter value does not exist in your JSON documents. When you create a table with theDBMS_CLOUDpackage, the default field delimiter is ‘|’ (pipe).Let us consider a scenario where a JSON data contains ‘
|’. For example:{"attr1": "oK4IJ|V", "attr2": "igN”}If the field delimiter is not specified, ‘
|’ is considered as the field delimiter and a query returns the following.{"attr1": "oK4IJTo ensure that your JSON documents are not split unintentionally, rendering the data to invalid JSON documents, set the field delimiter format option to a value that is not in the data. For example:
json_object('delimiter' value 'X''0''') -
If your JSON documents are stored as single document line-delimited data, they are processed as textual data without any issues. For a JSON document that spans across multiple rows in an external file, you must set the format option
recorddelimiterto a value or character that does not exist in the data file. For example:json_object('recorddelimiter' value '0x''1B''')If the
recorddelimiteroption is not set, every line is considered as an individual record. The default record delimiternewlinerenders the data to invalid JSON documents. -
In an external data file, if a field is not explicitly specified then the default data type is set to
CHAR(255). Thus, reading JSON documents larger than 255 characters returns an error. Therefore, you must set the parameterfield_listto a value higher than the largest JSON document within your files. For example:field_list =>'"MYDATA" char(10000)' -
The default read buffer size in Autonomous Database is 10 MB. For JSON documents larger than 10 MB, you must set the format option
readsizeto a higher value, let's say 20 MB.json_object('readsize' value '20000000')
Here is a sample external table definition that includes all these parameters.
BEGIN
DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
table_name =>'fruit_5',
credential_name =>'DEF_CRED_NAME',
format => json_object('type' value 'jsondoc'),
file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitMultiLine*.json';
END;
/
The following examples demonstrate how to access JSON documents as
textual data with Oracle Database 19c or earlier using
the CLOB data type. Alternatively, you can use the
VARCHAR2 data type depending on the size of your JSON
documents.
-
Line-delimited JSON documents
-
JSON arrays
-
Object wrapped JSON arrays
-
Single JSON document with multiline files
Example 1: Query Line-delimited JSON Documents as Textual Data
This example shows how to query a JSON file containing multiple line-delimited JSON documents in object storage.
A JSON file, for example, fruitLineDelimited.json
exists in the object storage that has three lines with one object per line.
{ "name": "apple", "count": 20 }
{ "name": "orange", "count": 42 }
{ "name": "pear", "count": 10 }
-
Create an external table in your database using the JSON file.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_6', column_list => 'mydata clob', credential_name =>'DEF_CRED_NAME', field_list =>'"MYDATA" char(10000)', format => json_object('delimiter' value 'X''0'''), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitLineDelimited.json'; END; /The parameters are:
-
table_name: is the external table name. -
column_list: is a comma delimited list of column names and data types for the external table. The list includes the columns inside the data file and those derived from the object name (from names in the file path specified byfile_uri_list). -
credential_name: is the name of the credential created in the previous step. Thecredential_nameparameter must conform to Oracle object naming conventions. See Database Object Naming Rules for more information. -
field_list: identifies the fields in the source files and their data types. -
format: defines the options you can specify to describe the format of the source file. -
file_uri_list: is a comma delimited list of the source files you want to query.
-
-
Query the external table using the
SELECTstatement.SELECT * FROM fruit_6;
Example 2: Query JSON Arrays as Textual Data
This example shows how to query a single array of JSON objects. You cannot automatically unpack arrays as part of the external table data access, but you can do the unpacking while accessing the data.
A JSON file, for example, fruitArray.json exists in the
object storage and has the following data.
[
{
"name" : "apple",
"count": 20
},
{
"name" : "orange",
"count": 42
},
{
"name" : "pear",
"count": 10
}
]
-
Create an external table in your database using the JSON file.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_7', column_list => 'mydata clob', credential_name =>'DEF_CRED_NAME', field_list =>'"MYDATA" char(10000)', format => json_object('delimiter' value 'X''0''','recorddelimiter' value '0x''1B'''), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitArray.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_7 NESTED mydata COLUMNS (nested path '$[*]' COLUMNS (data format json path '$'));
Example 3: Query Object Wrapped JSON Arrays as Textual Data
This example shows how to query a JSON file wrapped in an outer JSON document. You cannot automatically unpack arrays as part of the external table data access, but you can do the unpacking and JSON document subsetting while accessing the data.
A JSON file, for example, fruitEmbeddedArray.json
exists in the object storage and has the following data:
{
"last_updated": 1434054678,
"ttl": 0,
"version": "1.0",
"fruit": [
{ "name" : "apple", "count": 20 },
{ "name" : "orange", "count": 42 },
{ "name" : "pear", "count": 10 }
]
}
-
Create an external table in your database using the JSON file.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_8', column_list => 'mydata clob', credential_name =>'DEF_CRED_NAME', field_list =>'"MYDATA" char(10000)', format => json_object('delimiter' value 'X''0''','recorddelimiter' value '0x''1B'''), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitEmbeddedArray.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_8 NESTED mydata COLUMNS (nested path '$.fruit[*]' COLUMNS (data format json path '$'));
Example 4: Query a Single JSON Document with Multiline Files as Textual Data
This example shows how to query a single JSON document with multiline files.
-
A single JSON document with multiline files can be mapped to a table, where each JSON file in the directory is mapped to a single row.
-
A single JSON document with multiline files can be a directory containing JSON files where each JSON file is mapped to a single row in the table.
This example uses JSON files containing the following data:
fruitMultiLineFile1.json
{
"name" : "apple",
"count": 42
}
fruitMultiLineFile2.json
{
"name" : "orange",
"count": 5
}
fruitMultiLineFile3.json
{
"name" : "pear",
"count": 10
}
-
Create an external table in your database using these JSON files.
BEGINDBMS_CLOUD.CREATE_EXTERNAL_TABLE( table_name =>'fruit_9', column_list => 'mydata clob', credential_name =>'DEF_CRED_NAME', field_list =>'"MYDATA" char(10000)', format => json_object('delimiter' value 'X''0''','recorddelimiter' value '0x''1B'''), file_uri_list =>'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/fruitMultiLine*.json'; END; / -
Query the external table using the
SELECTstatement.SELECT * FROM fruit_9;
For more information on JSON documents, see JSON Document Stores
Parent topic: Query JSON Data in Object Storage using External Tables