fix: use UTC consistently across all datetime operations to resolve timezone display issues #14192
euvre wants to merge 2 commits into infiniflow:main
Walkthrough
This PR standardizes timestamp handling across the codebase by converting local-time usages to UTC.
Estimated code review effort: 3 (Moderate) | ~25 minutes
Pre-merge checks: 3 passed, 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 7
🧹 Nitpick comments (4)
api/apps/chunk_app.py (1)
328-329: Use one UTC value for both creation fields.
This prevents intra-record timestamp drift between `create_time` and `create_timestamp_flt`.
Proposed tweak
- d["create_time"] = str(datetime.datetime.utcnow()).replace("T", " ")[:19]
- d["create_timestamp_flt"] = datetime.datetime.utcnow().timestamp()
+ now_utc = datetime.datetime.utcnow()
+ d["create_time"] = now_utc.strftime("%Y-%m-%d %H:%M:%S")
+ d["create_timestamp_flt"] = now_utc.timestamp()
Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/apps/chunk_app.py` around lines 328 - 329, Compute a single UTC datetime once and reuse it for both fields to avoid drift: assign now = datetime.datetime.utcnow() (or similar) and then set d["create_time"] = str(now).replace("T", " ")[:19] and d["create_timestamp_flt"] = now.timestamp(), referencing the existing dict d and the keys "create_time" and "create_timestamp_flt".
api/db/services/document_service.py (2)
1006-1007: Capture `datetime.utcnow()` once per emitted record/task.
Paired fields (`progress_msg` + `begin_at`, `create_time` + `create_timestamp_flt`) should come from the same instant for consistency.
Proposed tweak
- "progress_msg": datetime.utcnow().strftime("%H:%M:%S") + " created task " + ty,
- "begin_at": datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S"),
+ "progress_msg": now_utc.strftime("%H:%M:%S") + " created task " + ty,
+ "begin_at": now_utc.strftime("%Y-%m-%d %H:%M:%S"),
@@
- d["create_time"] = str(datetime.utcnow()).replace("T", " ")[:19]
- d["create_timestamp_flt"] = datetime.utcnow().timestamp()
+ now_utc = datetime.utcnow()
+ d["create_time"] = now_utc.strftime("%Y-%m-%d %H:%M:%S")
+ d["create_timestamp_flt"] = now_utc.timestamp()
Also applies to: 1082-1083
Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/db/services/document_service.py` around lines 1006 - 1007, The code calls datetime.utcnow() separately for paired timestamp fields causing inconsistency; capture a single utcnow() into a local variable (e.g., now) at the start of the record/task creation and use that same value to derive both "progress_msg" and "begin_at" (and likewise for "create_time" and "create_timestamp_flt") so those pairs come from the same instant; update the blocks that set progress_msg/begin_at and create_time/create_timestamp_flt to use the single captured timestamp variable.
886-890: Use one captured UTC instant for duration updates.
This avoids tiny skew between fallback `process_begin_at` and `process_duration` in the same update cycle.
Proposed tweak
- begin_at = d.get("process_begin_at")
- if not begin_at:
-     begin_at = datetime.utcnow()
+ now_utc = datetime.utcnow()
+ begin_at = d.get("process_begin_at")
+ if not begin_at:
+     begin_at = now_utc
      # fallback
      cls.update_by_id(d["id"], {"process_begin_at": begin_at})
- info = {"process_duration": max(datetime.timestamp(datetime.utcnow()) - begin_at.timestamp(), 0), "run": status}
+ info = {"process_duration": max((now_utc - begin_at).total_seconds(), 0), "run": status}
Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/db/services/document_service.py` around lines 886 - 890, Capture a single UTC instant and reuse it for both the fallback process_begin_at update and as the start timestamp when computing process_duration: assign now = datetime.utcnow() (or reuse the existing begin_at variable but only set it once), call cls.update_by_id(d["id"], {"process_begin_at": begin_at_or_now}) with that captured instant, and compute info["process_duration"] using that same captured instant as the start (e.g. process_duration = max(datetime.timestamp(datetime.utcnow()) - begin_at_or_now.timestamp(), 0)); update the code paths around begin_at, cls.update_by_id, and the info dict to use this single captured timestamp.
api/db/services/evaluation_service.py (1)
287-287: Make auto-generated run names timezone-explicit.
After switching to UTC, this label can still be read as local time in UI/logs. Appending `UTC` avoids ambiguity.
Proposed tweak
- name = f"Evaluation Run {datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')}"
+ name = f"Evaluation Run {datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')} UTC"
Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/db/services/evaluation_service.py` at line 287, The auto-generated evaluation run name currently uses datetime.utcnow() but lacks timezone context; update the assignment that builds the name (the line setting the variable name to f"Evaluation Run {datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')}") to append an explicit timezone indicator (e.g., " UTC") so the resulting label is "Evaluation Run <timestamp> UTC" to remove ambiguity in UI/logs.
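The pairing nitpicks above share one idea: read the clock once and derive every related field from that value. A minimal sketch of that pattern follows. It assumes a hypothetical helper named `stamp_record` (not part of the PR) and uses an aware UTC datetime rather than `utcnow()`, which is a choice of this example, not something the nitpicks require.

```python
from datetime import datetime, timezone
from typing import Optional


def stamp_record(d: dict, now_utc: Optional[datetime] = None) -> dict:
    """Fill both creation fields from a single UTC instant.

    `stamp_record` is a hypothetical helper used for illustration only.
    """
    # Read the clock once so the string and the float describe the same moment.
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    d["create_time"] = now_utc.strftime("%Y-%m-%d %H:%M:%S")
    d["create_timestamp_flt"] = now_utc.timestamp()
    return d
```

Accepting the instant as an optional argument also makes the helper easy to test with a fixed time.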
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@admin/server/auth.py`:
- Around line 182-183: The assignments to user.update_time and user.update_date
are creating one-element tuples because of trailing commas; change the
assignments in the auth code to assign the raw values returned by
current_timestamp() and datetime_format(datetime.utcnow()) (remove the trailing
commas) so that user.update_time and user.update_date receive the expected
scalar types before calling user.save().
In `@api/apps/sdk/doc.py`:
- Around line 955-956: The create_timestamp_flt assignment uses a naive datetime
(datetime.utcnow().timestamp()) which Python treats as local time and can be
offset on non-UTC hosts; update the code that sets d["create_timestamp_flt"] to
use a timezone-aware UTC datetime (e.g., create a datetime with
timezone=datetime.timezone.utc via datetime.now(datetime.timezone.utc) or
utcnow().replace(tzinfo=datetime.timezone.utc)) before calling .timestamp() so
the epoch is correct; locate d["create_timestamp_flt"] in api/apps/sdk/doc.py
and replace the naive timestamp call accordingly (you can also consider using
the same UTC-aware source for d["create_time"] to keep consistency).
In `@api/apps/system_app.py`:
- Around line 130-133: The heartbeat time calculation uses a naive UTC datetime
which can be misinterpreted on non-UTC servers; replace the naive call that sets
now in system_app.py with an aware UTC timestamp (e.g., use
datetime.now(timezone.utc).timestamp()) so
REDIS_CONN.zrangebyscore(task_executor_id, now - 60 * 30, now) queries the
correct epoch range; also add the required import for timezone from datetime if
not present.
In `@api/db/services/file2document_service.py`:
- Around line 75-79: The update_by_file_id method is filtering on the primary id
(cls.model.id) instead of the file mapping column; change the where clause to
use the file_id column (e.g., cls.model.file_id == file_id) so the update
targets the row identified by file_id, keep the existing timestamp updates
(obj["update_time"], obj["update_date"]) and the call to
cls.model.update(...).execute(), and return File2Document(**obj) as before.
In `@api/db/services/task_service.py`:
- Around line 345-346: The subtraction datetime.utcnow() - task.begin_at can
raise when task.begin_at is None; guard the duration calculation in the method
that sets process_duration by checking task.begin_at (or Task.begin_at) before
computing seconds and only compute/update process_duration when begin_at is
present, otherwise set process_duration to None or 0 as appropriate and call
cls.model.update(...) with that value (use the same id/cls.model.id selector);
ensure you reference task.begin_at, process_duration, datetime.utcnow(),
cls.model.update and id when making the change.
In `@api/utils/health_utils.py`:
- Around line 312-315: The health window computation uses
datetime.utcnow().timestamp() which can be misinterpreted on non-UTC hosts;
update the import to include timezone (from datetime import datetime, timezone)
and replace datetime.utcnow().timestamp() with
datetime.now(timezone.utc).timestamp() where `now` is computed (used by
`REDIS_CONN.zrangebyscore` for `task_executors` heartbeat queries) so the Redis
score range is based on an explicit UTC timestamp.
In `@common/time_utils.py`:
- Around line 50-53: In timestamp_to_date: treat 0 as a valid timestamp by
changing the check to "if timestamp is None: timestamp = time.time()", and stop
unconditionally dividing by 1000; instead detect milliseconds vs seconds (e.g.,
if timestamp > 1e12 then timestamp = timestamp / 1000) so time.time() (seconds)
isn't mis-scaled, then call time.gmtime(timestamp); update references to the
timestamp variable and use time.time() and time.gmtime() accordingly.
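The `admin/server/auth.py` finding above rests on a Python detail that is easy to miss: a trailing comma after an expression builds a one-element tuple. The sketch below shows the effect in isolation, using a hypothetical `User` stand-in because the real model is not shown in this review.

```python
class User:
    """Hypothetical stand-in for the ORM model referenced in the review."""

    update_time = None


user = User()

# Trailing comma: the right-hand side is the tuple (1700000000000,), not an int.
user.update_time = 1700000000000,
with_comma = user.update_time

# Without the comma the attribute receives the scalar itself.
user.update_time = 1700000000000
without_comma = user.update_time
```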
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fd843fcc-6410-48c9-b685-3bee288dc65c
📒 Files selected for processing (24)
- admin/server/auth.py
- admin/server/routes.py
- api/apps/api_app.py
- api/apps/chunk_app.py
- api/apps/restful_apis/system_api.py
- api/apps/sdk/doc.py
- api/apps/system_app.py
- api/apps/user_app.py
- api/db/services/api_service.py
- api/db/services/common_service.py
- api/db/services/connector_service.py
- api/db/services/dialog_service.py
- api/db/services/document_service.py
- api/db/services/evaluation_service.py
- api/db/services/file2document_service.py
- api/db/services/knowledgebase_service.py
- api/db/services/langfuse_service.py
- api/db/services/pipeline_operation_log_service.py
- api/db/services/search_service.py
- api/db/services/system_settings_service.py
- api/db/services/task_service.py
- api/db/services/user_service.py
- api/utils/health_utils.py
- common/time_utils.py
common/time_utils.py
  if not timestamp:
      timestamp = time.time()
  timestamp = int(timestamp) / 1000
- time_array = time.localtime(timestamp)
+ time_array = time.gmtime(timestamp)
Fix default timestamp handling in timestamp_to_date.
On Line 51, time.time() returns seconds, but Line 52 divides by 1000 again, producing a 1970-era date. Also, 0 is incorrectly treated as “missing”.
Proposed fix
- if not timestamp:
-     timestamp = time.time()
- timestamp = int(timestamp) / 1000
+ if timestamp in (None, ""):
+     timestamp = current_timestamp()
+ timestamp = int(timestamp) / 1000
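The review offers two different remedies for `timestamp_to_date`: keep the millisecond contract and default to a millisecond clock, or guess the unit from the magnitude. The following is only a sketch of the first option. It assumes, without confirmation from the source, that callers pass epoch milliseconds and that the output format is `%Y-%m-%d %H:%M:%S`; the `_sketch` suffix marks the function as hypothetical.

```python
import time
from typing import Optional, Union


def timestamp_to_date_sketch(
    timestamp_ms: Optional[Union[int, float, str]] = None,
    format_string: str = "%Y-%m-%d %H:%M:%S",
) -> str:
    """Format an epoch-millisecond value as a UTC date string."""
    # Only None and "" mean "missing"; 0 is a valid instant (the epoch).
    if timestamp_ms is None or timestamp_ms == "":
        timestamp_ms = time.time() * 1000  # keep the default in milliseconds too
    seconds = int(timestamp_ms) / 1000
    # gmtime() interprets the value as UTC regardless of the host timezone.
    return time.strftime(format_string, time.gmtime(seconds))
```

Keeping a single unit at the boundary avoids the magnitude heuristic, which would misread small millisecond values as seconds.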
Signed-off-by: noob <yixiao121314@outlook.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/db/services/document_service.py`:
- Around line 1084-1086: now_utc is created as a naive datetime via
datetime.utcnow() so calling .timestamp() will be interpreted as local time on
non-UTC hosts; change creation of now_utc to an aware UTC datetime (e.g.
datetime.now(timezone.utc) or datetime.utcnow().replace(tzinfo=timezone.utc))
before setting d["create_time"] and d["create_timestamp_flt"] so that
d["create_timestamp_flt"] is the true UTC epoch seconds for the document.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 35941532-4bf3-4aab-8321-8fd112a2d261
📒 Files selected for processing (10)
- admin/server/auth.py
- api/apps/chunk_app.py
- api/apps/sdk/doc.py
- api/apps/system_app.py
- api/db/services/document_service.py
- api/db/services/evaluation_service.py
- api/db/services/file2document_service.py
- api/db/services/task_service.py
- api/utils/health_utils.py
- common/time_utils.py
✅ Files skipped from review due to trivial changes (3)
- api/db/services/evaluation_service.py
- api/apps/sdk/doc.py
- api/apps/system_app.py
🚧 Files skipped from review as they are similar to previous changes (5)
- admin/server/auth.py
- api/utils/health_utils.py
- api/apps/chunk_app.py
- api/db/services/task_service.py
- api/db/services/file2document_service.py
api/db/services/document_service.py
  now_utc = datetime.utcnow()
  d["create_time"] = now_utc.strftime("%Y-%m-%d %H:%M:%S")
  d["create_timestamp_flt"] = now_utc.timestamp()
🧩 Analysis chain
🌐 Web query:
In Python 3, does datetime.timestamp() interpret naive datetime objects as local time, and is datetime.now(timezone.utc) the recommended UTC-safe approach over datetime.utcnow() for epoch conversion?
💡 Result:
Yes, in Python 3, datetime.timestamp interprets naive datetime objects as local time. For UTC-safe epoch conversion, datetime.now(timezone.utc) is the recommended approach over the deprecated datetime.utcnow.
Citations:
- 1: https://docs.python.org/library/datetime.html
- 2: https://docs.python.org/3/library/datetime.html
- 3: https://docs.python.org/3/library/datetime
- 4: https://blog.ganssle.io/articles/2022/04/naive-local-datetimes.html
- 5: https://runebook.dev/en/docs/python/library/datetime/datetime.datetime.timestamp
- 6: https://stackoverflow.com/questions/7065164/how-to-make-a-datetime-object-aware-not-naive
- 7: https://blog.miguelgrinberg.com/post/it-s-time-for-a-change-datetime-utcnow-is-now-deprecated
- 8: https://discuss.python.org/t/why-is-datetime-utcnow-deprecated/86868
- 9: https://stackoverflow.com/questions/32517248/what-is-the-difference-between-python-functions-datetime-now-and-datetime-t
- 10: https://stackoverflow.com/questions/42433171/when-to-use-datetime-utcnow-or-datetime-nowtz-pytz-utc-replacetzinfo-none
🏁 Script executed:
# First, verify the file exists and check lines around 1084-1086
head -1090 api/db/services/document_service.py | tail -20
Repository: infiniflow/ragflow
Length of output: 989
🏁 Script executed:
# Check imports at the top of the file
head -50 api/db/services/document_service.py
Repository: infiniflow/ragflow
Length of output: 1993
🏁 Script executed:
# Search for all datetime imports in the file
rg "from datetime import|import datetime" api/db/services/document_service.py
Repository: infiniflow/ragflow
Length of output: 91
🏁 Script executed:
rg "datetime\.utcnow" api/db/services/document_service.py -n
Repository: infiniflow/ragflow
Length of output: 195
🏁 Script executed:
# Check context around line 884
sed -n '880,890p' api/db/services/document_service.py
Repository: infiniflow/ragflow
Length of output: 551
🏁 Script executed:
# Check context around line 1001
sed -n '998,1008p' api/db/services/document_service.py
Repository: infiniflow/ragflow
Length of output: 448
create_timestamp_flt is incorrect with naive utcnow() + .timestamp().
At Line 1086, now_utc is naive (datetime.utcnow()), and .timestamp() treats naive datetimes as local time. On non-UTC servers this stores an offset epoch value, which breaks the UTC-fix objective.
Proposed fix
-from datetime import datetime
+from datetime import datetime, timezone
@@
- now_utc = datetime.utcnow()
+ now_utc = datetime.now(timezone.utc)
  d["create_time"] = now_utc.strftime("%Y-%m-%d %H:%M:%S")
  d["create_timestamp_flt"] = now_utc.timestamp()
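To see why the naive value is a problem, it can help to compare both forms on a fixed instant. This sketch uses only the standard library; the variable names are illustrative and not taken from the PR.

```python
from datetime import datetime, timezone

# The same wall-clock fields, once naive and once tagged as UTC.
naive = datetime(2024, 1, 1, 0, 0, 0)
aware = naive.replace(tzinfo=timezone.utc)

# An aware datetime converts to the true epoch value on any host.
aware_epoch = aware.timestamp()

# A naive datetime is treated as local time, so the result shifts
# by the host's UTC offset (zero only when the host runs in UTC).
naive_epoch = naive.timestamp()
host_offset_seconds = aware_epoch - naive_epoch
```

On a host set to UTC+8, for example, `host_offset_seconds` would come out as 28800, which is exactly the error that would be stored in `create_timestamp_flt`.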
I'll open this PR after investigation.
What problem does this PR solve?
Description
Fixes #6970
API responses returned incorrect times when the server timezone was not UTC. Date strings like `create_date` and `update_date` were generated using local time (e.g., UTC+8) but displayed without timezone context, causing an offset of several hours between the displayed and actual time.
Root Cause
- `timestamp_to_date()` used `time.localtime()` instead of `time.gmtime()`
- `date_string_to_timestamp()` used `time.mktime()` (local-tz) instead of `calendar.timegm()` (UTC)
- `get_format_time()` and all service layers used `datetime.now()` instead of `datetime.utcnow()`
- `datetime_format()` did not handle timezone-aware inputs
These caused a mismatch: timestamps were correct UTC values, while the corresponding date strings reflected server local time.
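The mismatch described above goes away when both directions of the conversion use UTC helpers. A sketch of the round trip follows, with hypothetical function names `to_date_utc` and `to_timestamp_utc`, and assuming millisecond timestamps and a `%Y-%m-%d %H:%M:%S` format (neither is confirmed by the description).

```python
import calendar
import time

FMT = "%Y-%m-%d %H:%M:%S"


def to_date_utc(timestamp_ms: int) -> str:
    """Epoch milliseconds -> UTC date string (gmtime, not localtime)."""
    return time.strftime(FMT, time.gmtime(timestamp_ms / 1000))


def to_timestamp_utc(date_string: str) -> int:
    """UTC date string -> epoch milliseconds (timegm, not mktime)."""
    return calendar.timegm(time.strptime(date_string, FMT)) * 1000
```

Because neither function consults the host timezone, converting a timestamp to a string and back yields the original value on any server.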
Changes
Core time utilities (`common/time_utils.py`)
- `timestamp_to_date()`: `time.localtime()` -> `time.gmtime()`
- `date_string_to_timestamp()`: `time.mktime()` -> `calendar.timegm()`
- `datetime_format()`
- `get_format_time()`: `datetime.now()` -> `datetime.utcnow()`
- `delta_seconds()`: `datetime.now()` -> `datetime.utcnow()`
Service & API layer (23 files)
Replaced all `datetime.now()` with `datetime.utcnow()` across:
- `api/db/services/`: common_service, user_service, dialog_service, api_service, langfuse_service, search_service, file2document_service, knowledgebase_service, system_settings_service, pipeline_operation_log_service, task_service, connector_service, document_service, evaluation_service
- `api/apps/`: user_app, api_app, system_app, chunk_app, restful_apis/system_api, sdk/doc
- `api/utils/`: health_utils
- `admin/server/`: auth, routes
Testing
- `datetime.now()` occurrences in `api/` and `admin/` have been replaced; verified with `grep`.
- `timestamp_to_date()` → `date_string_to_timestamp()` now uses UTC consistently on both ends.
Type of change