Forum Discussion
martin_larsson_ellevio (Copper Contributor)
May 12, 2025
Error in copy activity with Oracle 2.0
I am trying to migrate our copy activities to Oracle connector version 2.0. The destination is Parquet in an Azure Storage account, which works with the Oracle 1.0 connector.
Just switching to 2.0 on the linked service and adjusting the connection string (server) is straightforward, and a "test connection" is successful. But in a pipeline with a copy activity that uses the linked service, I get the following error message on some tables.
ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: java.lang.ArrayIndexOutOfBoundsException:255
total entry:1
com.microsoft.datatransfer.bridge.parquet.ParquetWriterBuilderBridge.addDecimalColumn(ParquetWriterBuilderBridge.java:107)
.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'
As the error suggests, it is unable to convert a decimal value from Oracle to Parquet. To me it looks like a bug in the new connector. Has anybody seen this before and found a solution? The 1.0 connector is apparently being deprecated in the coming weeks.
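For reference, the switched linked service looks roughly like this (a generic sketch with placeholder values, not our exact parameterized definition; version 2.0 takes a server string plus authentication type instead of the old connection string):
{
    "name": "Oracle",
    "properties": {
        "type": "Oracle",
        "version": "2.0",
        "typeProperties": {
            "server": "<host>:<port>/<service_name>",
            "authenticationType": "Basic",
            "username": "<username>",
            "password": {
                "type": "SecureString",
                "value": "<password>"
            }
        }
    }
}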
Here is the code for the copy activity:
{
    "name": "Copy",
    "type": "Copy",
    "dependsOn": [],
    "policy": {
        "timeout": "1.00:00:00",
        "retry": 2,
        "retryIntervalInSeconds": 60,
        "secureOutput": false,
        "secureInput": false
    },
    "userProperties": [
        {
            "name": "Source",
            "value": "@{pipeline().parameters.schema}.@{pipeline().parameters.table}"
        },
        {
            "name": "Destination",
            "value": "raw/@{concat(pipeline().parameters.source, '/', pipeline().parameters.schema, '/', pipeline().parameters.table, '/', formatDateTime(pipeline().TriggerTime, 'yyyy/MM/dd'))}/"
        }
    ],
    "typeProperties": {
        "source": {
            "type": "OracleSource",
            "oracleReaderQuery": {
                "value": "SELECT @{coalesce(pipeline().parameters.columns, '*')}\nFROM \"@{pipeline().parameters.schema}\".\"@{pipeline().parameters.table}\"\n@{if(variables('incremental'), variables('where_clause'), '')}\n@{if(equals(pipeline().globalParameters.ENV, 'dev'),\n'FETCH FIRST 1000 ROWS ONLY'\n,''\n)}",
                "type": "Expression"
            },
            "partitionOption": "None",
            "convertDecimalToInteger": true,
            "queryTimeout": "02:00:00"
        },
        "sink": {
            "type": "ParquetSink",
            "storeSettings": {
                "type": "AzureBlobFSWriteSettings"
            },
            "formatSettings": {
                "type": "ParquetWriteSettings",
                "maxRowsPerFile": 1000000,
                "fileNamePrefix": {
                    "value": "@variables('file_name_prefix')",
                    "type": "Expression"
                }
            }
        },
        "enableStaging": false,
        "translator": {
            "type": "TabularTranslator",
            "typeConversion": true,
            "typeConversionSettings": {
                "allowDataTruncation": true,
                "treatBooleanAsNumber": false
            }
        }
    },
    "inputs": [
        {
            "referenceName": "Oracle",
            "type": "DatasetReference",
            "parameters": {
                "host": {
                    "value": "@pipeline().parameters.host",
                    "type": "Expression"
                },
                "port": {
                    "value": "@pipeline().parameters.port",
                    "type": "Expression"
                },
                "service_name": {
                    "value": "@pipeline().parameters.service_name",
                    "type": "Expression"
                },
                "username": {
                    "value": "@pipeline().parameters.username",
                    "type": "Expression"
                },
                "password_secret_name": {
                    "value": "@pipeline().parameters.password_secret_name",
                    "type": "Expression"
                },
                "schema": {
                    "value": "@pipeline().parameters.schema",
                    "type": "Expression"
                },
                "table": {
                    "value": "@pipeline().parameters.table",
                    "type": "Expression"
                }
            }
        }
    ],
    "outputs": [
        {
            "referenceName": "Lake_PARQUET_folder",
            "type": "DatasetReference",
            "parameters": {
                "source": {
                    "value": "@pipeline().parameters.source",
                    "type": "Expression"
                },
                "namespace": {
                    "value": "@pipeline().parameters.schema",
                    "type": "Expression"
                },
                "entity": {
                    "value": "@variables('sink_table_name')",
                    "type": "Expression"
                },
                "partition": {
                    "value": "@formatDateTime(pipeline().TriggerTime, 'yyyy/MM/dd')",
                    "type": "Expression"
                },
                "container": {
                    "value": "@variables('container')",
                    "type": "Expression"
                }
            }
        }
    ]
}
5 Replies
- Nikhilver (Copper Contributor)
Hi, can you please show precisely where you put the "Support V1 data types" setting mentioned above?
- martin_larsson_ellevio (Copper Contributor)
It is shown in the reply from adaardor in this thread. Click the "Show entire thread" or something similar to view all replies.
- adaardor (Copper Contributor)
In your linked service connection settings, have you tried the additional connection properties to support the v1 data types?
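For what it's worth, in the linked service JSON that setting should show up as one extra property under typeProperties, roughly like this (the property name supportV1DataTypes is just my reading of the "Support V1 data types" label in the UI, so verify it against the JSON the portal generates):
"typeProperties": {
    "server": "<host>:<port>/<service_name>",
    "authenticationType": "Basic",
    "username": "<username>",
    "supportV1DataTypes": true,
    ...
}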
- martin_larsson_ellevio (Copper Contributor)
Now I have tried it and it works!! Thank you
- martin_larsson_ellevio (Copper Contributor)
I need to add that the solution in my situation was to add the following line to the Parquet dataset JSON, at the same level as compressionCodec:
"useParquetV2": true
in combination with adaardor's suggestion above to enable "Support V1 data types". Without it, the Parquet file could not be read by Databricks; it failed with an error saying a decimal precision of 256 is not allowed (the maximum is 38).
Full code for Parquet dataset:
{
    "name": "Lake_PARQUET_folder",
    "properties": {
        "linkedServiceName": {
            "referenceName": "DataLake",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "source": {
                "type": "string"
            },
            "namespace": {
                "type": "string"
            },
            "entity": {
                "type": "string"
            },
            "partition": {
                "type": "string"
            },
            "container": {
                "type": "string",
                "defaultValue": "raw"
            }
        },
        "folder": {
            "name": "Data Lakes"
        },
        "annotations": [],
        "type": "Parquet",
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "folderPath": {
                    "value": "@concat(dataset().source, '/', dataset().namespace, '/', dataset().entity, '/', dataset().partition)",
                    "type": "Expression"
                },
                "fileSystem": {
                    "value": "@dataset().container",
                    "type": "Expression"
                }
            },
            "compressionCodec": "gzip",
            "useParquetV2": true
        },
        "schema": []
    },
    "type": "Microsoft.DataFactory/factories/datasets"
}