Merged
@@ -56,7 +56,7 @@ DB::Array AggregateFunctionParserBloomFilterAgg::parseFunctionParameters(
{
if (func_info.phase == substrait::AGGREGATION_PHASE_INITIAL_TO_INTERMEDIATE || func_info.phase == substrait::AGGREGATION_PHASE_INITIAL_TO_RESULT)
{
- auto get_parameter_field = [](const DB::ActionsDAG::Node * node, size_t /*paramter_index*/) -> DB::Field
+ auto get_parameter_field = [](const DB::ActionsDAG::Node * node, size_t /*parameter_index*/) -> DB::Field
{
Field ret;
node->column->get(0, ret);
2 changes: 1 addition & 1 deletion docs/developers/HowTo.md
@@ -156,7 +156,7 @@ gdb ${GLUTEN_HOME}/cpp/build/releases/libgluten.so 'core-Executor task l-2000883
Currently, we have no dedicated memory allocator implemented by jemalloc. User can set environment variable `LD_PRELOAD` for lib jemalloc
to let it override the corresponding C standard functions entirely. It may help alleviate OOM issues.

- `spark.executorEnv.LD_PREALOD=/path/to/libjemalloc.so`
+ `spark.executorEnv.LD_PRELOAD=/path/to/libjemalloc.so`
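
For reference, the corrected property can also be set persistently in `spark-defaults.conf`; the library path below is a placeholder for wherever jemalloc is actually installed:

```
# spark-defaults.conf (library path is a placeholder)
spark.executorEnv.LD_PRELOAD  /path/to/libjemalloc.so
```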

# How to run TPC-H on Velox backend

2 changes: 1 addition & 1 deletion docs/developers/UsingGperftoolsInCH.md
@@ -11,7 +11,7 @@ We need using gpertools to find the memory or CPU issue. That's what this docume
Install gperftools as described in https://github.com/gperftools/gperftools.
We get the library and the command line tools.

- ## Compiler libch.so
+ ## Compile libch.so
Disable jemalloc `-DENABLE_JEMALLOC=OFF` in cpp-ch/CMakeLists.txt, and recompile libch.so.

## Run Gluten with gperftools
2 changes: 1 addition & 1 deletion docs/developers/UsingJemallocWithCH.md
@@ -28,7 +28,7 @@ cd $Clickhouse_SOURCE_PATH/contrib/jemalloc && ./autogen.sh && ./configure.sh &&
```
Then we get jeprof in the directory `$Clickhouse_SOURCE_PATH/contrib/jemalloc/bin/jeprof`.

- ## Compiler libch.so
+ ## Compile libch.so
Ensure to enable jemalloc `-DENABLE_JEMALLOC=ON` in cpp-ch/CMakeLists.txt, and compile libch.so.

## Run Gluten with jemalloc heap tools
6 changes: 3 additions & 3 deletions docs/get-started/ClickHouse.md
@@ -89,10 +89,10 @@ git submodule update --init --recursive
##### build

There are several ways to build the backend library.
- 1. Build it direclty
+ 1. Build it directly


- If you have setup all requirements, you can use following command to build it direclty.
+ If you have setup all requirements, you can use following command to build it directly.

```bash
cd $gluten_root
@@ -340,7 +340,7 @@ You need to add these additional configs to spark:
--config spark.hadoop.fs.s3a.access.key=YOUR_ACCESS_KEY
--config spark.hadoop.fs.s3a.secret.key=YOUR_SECRET_KEY
```
- where S3_ENDPOINT must follow the format of `https://s3.region-code.amazonaws.com`, e.g. `https://s3.us-east-1.amazonaws.com` (or `http://hostname:39090 for MINIO)
+ where S3_ENDPOINT must follow the format of `https://s3.region-code.amazonaws.com`, e.g. `https://s3.us-east-1.amazonaws.com` (or `http://hostname:39090` for MINIO)

When you query the parquet files in S3, you need to add the prefix `s3a://` to the path, e.g. `s3a://your_bucket_name/path_to_your_parquet`.
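
The prefix rule above can be sketched in plain Python; the bucket and key names are hypothetical placeholders, not values from the docs:

```python
# Minimal sketch: building the s3a:// form of an S3 object path
# that Gluten expects for Parquet reads. Names are placeholders.
def to_s3a(bucket: str, key: str) -> str:
    # Strip any leading slash on the key so we don't emit s3a://bucket//key.
    return f"s3a://{bucket}/{key.lstrip('/')}"

print(to_s3a("your_bucket_name", "path_to_your_parquet"))
```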

4 changes: 2 additions & 2 deletions docs/get-started/VeloxGCS.md
@@ -10,7 +10,7 @@ Object stores offered by CSPs such as GCS are important for users of Gluten to s

## Installing the gcloud CLI

- To access GCS Objects using Gluten and Velox, first you have to [download an install the gcloud CLI] (https://cloud.google.com/sdk/docs/install).
+ To access GCS Objects using Gluten and Velox, first you have to [download and install the gcloud CLI](https://cloud.google.com/sdk/docs/install).


## Configuring GCS using a user account
@@ -22,7 +22,7 @@ After these steps, no specific configuration is required for Gluten, since the a
## Configuring GCS using a credential file

For workloads that need to be fully automated, manually authorizing can be problematic. For such cases it is better to use a json file with the credentials.
- This is described in the [instructions to configure a service account]https://cloud.google.com/sdk/docs/authorizing#service-account.
+ This is described in the [instructions to configure a service account](https://cloud.google.com/sdk/docs/authorizing#service-account).

Such json file with the credentials can be passed to Gluten:

6 changes: 3 additions & 3 deletions docs/velox-configuration.md
@@ -15,7 +15,7 @@ nav_order: 16
| spark.gluten.sql.columnar.backend.velox.SplitPreloadPerDriver | 2 | The split preload per task |
| spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct | 90 | If partial aggregation aggregationPct greater than this value, partial aggregation may be early abandoned. Note: this option only works when flushable partial aggregation is enabled. Ignored when spark.gluten.sql.columnar.backend.velox.flushablePartialAggregation=false. |
| spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows | 100000 | If partial aggregation input rows number greater than this value, partial aggregation may be early abandoned. Note: this option only works when flushable partial aggregation is enabled. Ignored when spark.gluten.sql.columnar.backend.velox.flushablePartialAggregation=false. |
- | spark.gluten.sql.columnar.backend.velox.asyncTimeoutOnTaskStopping | 30000ms | Timeout for asynchronous execution when task is being stopped in Velox backend. It's recommended to set to a number larger than network connection timeout that the possible aysnc tasks are relying on. |
+ | spark.gluten.sql.columnar.backend.velox.asyncTimeoutOnTaskStopping | 30000ms | Timeout for asynchronous execution when task is being stopped in Velox backend. It's recommended to set to a number larger than network connection timeout that the possible async tasks are relying on. |
| spark.gluten.sql.columnar.backend.velox.cacheEnabled | false | Enable Velox cache, default off. It's recommended to enablesoft-affinity as well when enable velox cache. |
| spark.gluten.sql.columnar.backend.velox.cachePrefetchMinPct | 0 | Set prefetch cache min pct for velox file scan |
| spark.gluten.sql.columnar.backend.velox.checkUsageLeak | true | Enable check memory usage leak. |
@@ -24,7 +24,7 @@ nav_order: 16
| spark.gluten.sql.columnar.backend.velox.cudf.enableValidation | true | Heuristics you can apply to validate a cuDF/GPU plan and only offload when the entire stage can be fully and profitably executed on GPU |
| spark.gluten.sql.columnar.backend.velox.cudf.memoryPercent | 50 | The initial percent of GPU memory to allocate for memory resource for one thread. |
| spark.gluten.sql.columnar.backend.velox.cudf.memoryResource | async | GPU RMM memory resource. |
- | spark.gluten.sql.columnar.backend.velox.cudf.shuffleMaxPrefetchBytes | 1028MB | Maximum bytes to prefetch in CPU memory during GPU shuffle read while waitingfor GPU available. |
+ | spark.gluten.sql.columnar.backend.velox.cudf.shuffleMaxPrefetchBytes | 1028MB | Maximum bytes to prefetch in CPU memory during GPU shuffle read while waiting for GPU available. |
| spark.gluten.sql.columnar.backend.velox.directorySizeGuess | 32KB | Deprecated, rename to spark.gluten.sql.columnar.backend.velox.footerEstimatedSize |
| spark.gluten.sql.columnar.backend.velox.enableTimestampNtzValidation | true | Enable validation fallback for TimestampNTZ type. When true (default), any plan containing TimestampNTZ will fall back to Spark execution. Set to false during development/testing of TimestampNTZ support to allow native execution. |
| spark.gluten.sql.columnar.backend.velox.fileHandleCacheEnabled | false | Disables caching if false. File handle cache should be disabled if files are mutable, i.e. file content may change while file path stays the same. |
@@ -78,7 +78,7 @@ nav_order: 16
| spark.gluten.sql.enable.enhancedFeatures | true | Enable some features including iceberg native write and other features. |
| spark.gluten.sql.rewrite.castArrayToString | true | When true, rewrite `cast(array as String)` to `concat('[', array_join(array, ', ', null), ']')` to allow offloading to Velox. |
| spark.gluten.velox.broadcast.build.targetBytesPerThread | 32MB | It is used to calculate the number of hash table build threads. Based on our testing across various thresholds (1MB to 128MB), we recommend a value of 32MB or 64MB, as these consistently provided the most significant performance gains. |
- | spark.gluten.velox.castFromVarcharAddTrimNode | false | If true, will add a trim node which has the same sementic as vanilla Spark to CAST-from-varchar.Otherwise, do nothing. |
+ | spark.gluten.velox.castFromVarcharAddTrimNode | false | If true, will add a trim node which has the same semantic as vanilla Spark to CAST-from-varchar.Otherwise, do nothing. |

## Gluten Velox backend *experimental* configurations

2 changes: 1 addition & 1 deletion docs/velox-spark-configuration.md
@@ -2,7 +2,7 @@ layout: page
title: Spark configurations status in Gluten Velox Backend
nav_order: 17

- The file lists the if Spark configurations are hornored by Gluten velox backend or not. Table is from Spark4.0 configuration page. The status are:
+ The file lists the if Spark configurations are honored by Gluten velox backend or not. Table is from Spark4.0 configuration page. The status are:
- ✅ Supported<br>
- ❌ Not Supported<br>
- ⚠️ Partial Support<br>