feat: Integrate coolbpf cpu profiling feature in loongcollector #2391

Open
wokron wants to merge 90 commits into alibaba:main from wokron:support-coolbpf-cpu-profiling

Conversation

@wokron

@wokron wokron commented Sep 18, 2025

No description provided.

@CLAassistant

CLAassistant commented Sep 18, 2025

CLA assistant check
All committers have signed the CLA.

@yyuuttaaoo yyuuttaaoo marked this pull request as ready for review November 6, 2025 02:17
Copilot AI review requested due to automatic review settings November 6, 2025 02:17
Contributor

Copilot AI left a comment

Pull Request Overview

This PR integrates coolbpf CPU profiling capabilities into loongcollector by adding a new input_cpu_profiling plugin. The implementation enables continuous CPU profiling of specified processes through command-line pattern matching and container discovery.

Key changes:

  • New CPU profiling plugin with process discovery mechanism
  • Integration with coolbpf profiler library
  • Plugin registration and lifecycle management
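As a rough illustration of the command-line pattern matching mentioned in the overview, the sketch below selects PIDs whose command line matches any of a set of regular expressions. `ProcessInfo` and `MatchPids` are hypothetical names invented for this example; how the plugin actually reads command lines and which regex engine it uses are not shown in this thread.

```cpp
#include <cstdint>
#include <regex>
#include <set>
#include <string>
#include <vector>

// Hypothetical process record; the PR's real types are not shown in this thread.
struct ProcessInfo {
    uint32_t pid;
    std::string cmdline;
};

// Returns the PIDs whose command line matches at least one pattern.
std::set<uint32_t> MatchPids(const std::vector<ProcessInfo>& procs,
                             const std::vector<std::regex>& patterns) {
    std::set<uint32_t> matched;
    for (const auto& proc : procs) {
        for (const auto& pattern : patterns) {
            if (std::regex_search(proc.cmdline, pattern)) {
                matched.insert(proc.pid);
                break; // one matching pattern is enough for this process
            }
        }
    }
    return matched;
}
```

`std::regex_search` matches anywhere in the string, so a pattern has to be anchored with `^` or `$` if a full or prefix match is intended.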

Reviewed Changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| core/plugin/input/InputCpuProfiling.{h,cpp} | New plugin implementation for CPU profiling input |
| core/ebpf/plugin/cpu_profiling/* | Core CPU profiling manager and process discovery logic |
| core/ebpf/driver/CpuProfiler.h | Wrapper for coolbpf profiler library integration |
| core/ebpf/Config.{h,cpp} | CPU profiling configuration option handling |
| core/ebpf/include/export.h | Type definitions for CPU profiling |
| core/unittest/input/InputCpuProfilingUnittest.cpp | Unit tests for the input plugin |
| core/unittest/ebpf/*Unittest.cpp | Unit tests for CPU profiling components |
| Various CMakeLists.txt | Build system updates for new components |
Comments suppressed due to low confidence (1)

core/unittest/input/InputCpuProfilingUnittest.cpp:1

  • Corrected spelling of 'CommandLines' to 'CommandLines' in comment context.

Comment on lines +136 to +138
    static void handler_without_ctx(uint32_t pid, const char* comm, const char* stack, uint32_t cnt) {
        mHandler(pid, comm, stack, cnt, mCtx);
    }

Copilot AI Nov 6, 2025

The static members mHandler and mCtx are read without synchronization in handler_without_ctx. That function is called from Poll() while a lock is held, but it can still race with Start() and Stop(), which modify these members. As a result, the callback could be invoked with stale or null pointer values.
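One possible way to address this kind of race is to keep the handler and its context behind a single mutex, and to copy both under the lock before invoking. The sketch below is a minimal illustration, not the PR's code: `ProfileHandler` and `CallbackSlot` are hypothetical names, and the callback signature is only modeled on the snippet above.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical callback type modeled on the snippet above.
using ProfileHandler = void (*)(uint32_t pid, const char* comm, const char* stack, uint32_t cnt, void* ctx);

class CallbackSlot {
public:
    // Called from the start/stop path: publish or clear handler and context together.
    void Set(ProfileHandler handler, void* ctx) {
        std::lock_guard<std::mutex> lock(mMutex);
        mHandler = handler;
        mCtx = ctx;
    }

    // Called from the polling path. Returns false if no handler is installed.
    bool Invoke(uint32_t pid, const char* comm, const char* stack, uint32_t cnt) {
        ProfileHandler handler = nullptr;
        void* ctx = nullptr;
        {
            // Take a consistent snapshot of both members.
            std::lock_guard<std::mutex> lock(mMutex);
            handler = mHandler;
            ctx = mCtx;
        }
        if (handler == nullptr) {
            return false;
        }
        handler(pid, comm, stack, cnt, ctx);
        return true;
    }

private:
    std::mutex mMutex;
    ProfileHandler mHandler = nullptr;
    void* mCtx = nullptr;
};
```

Invoking outside the lock keeps the critical section short, but whatever `ctx` points to must then outlive any in-flight call. Holding the lock for the duration of the callback is the stricter alternative.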

Comment on lines +265 to +266
    mEBPFAdapter->UpdatePlugin(PluginType::CPU_PROFILING,
                               buildCpuProfilingConfig(std::move(totalPids), std::nullopt, nullptr, nullptr));

Copilot AI Nov 6, 2025

Passing null handler and context to buildCpuProfilingConfig during update will overwrite the valid handler set during initialization, breaking profiling event handling.
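If the update path really does overwrite the stored handler, one common remedy is to distinguish "field not provided" from "field set to null" in the update. The sketch below shows that idea with `std::optional`. All types and names here (`Handler`, `ProfilingConfig`, `ProfilingUpdate`, `ApplyUpdate`) are hypothetical and do not reflect the actual signature of buildCpuProfilingConfig or UpdatePlugin.

```cpp
#include <optional>

// Hypothetical handler type and config structs, for illustration only.
using Handler = void (*)(void* ctx);

struct ProfilingConfig {
    Handler handler = nullptr;
    void* ctx = nullptr;
};

struct ProfilingUpdate {
    std::optional<Handler> handler; // std::nullopt means "keep the current handler"
    std::optional<void*> ctx;       // std::nullopt means "keep the current context"
};

// Applies only the fields that the update explicitly carries.
void ApplyUpdate(ProfilingConfig& current, const ProfilingUpdate& update) {
    if (update.handler.has_value()) {
        current.handler = *update.handler;
    }
    if (update.ctx.has_value()) {
        current.ctx = *update.ctx;
    }
}
```

With this shape, a PID-only update leaves both optionals empty, and clearing the handler requires passing an explicit null value.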

Comment thread core/ebpf/plugin/cpu_profiling/ProcessDiscoveryManager.cpp Outdated
Comment on lines +234 to +240
    int maxRetry = 5;
    for (int retry = 0; retry < maxRetry; ++retry) {
        if (QueueStatus::OK == ProcessQueueManager::GetInstance()->PushQueue(info.mQueueKey, std::move(item))) {
            break;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        if (retry == maxRetry - 1) {

Copilot AI Nov 6, 2025

[nitpick] Magic numbers 5 and 100 for retry attempts and sleep duration should be extracted as named constants or made configurable to improve maintainability and allow tuning.

Suggested change
    - int maxRetry = 5;
    - for (int retry = 0; retry < maxRetry; ++retry) {
    -     if (QueueStatus::OK == ProcessQueueManager::GetInstance()->PushQueue(info.mQueueKey, std::move(item))) {
    -         break;
    -     }
    -     std::this_thread::sleep_for(std::chrono::milliseconds(100));
    -     if (retry == maxRetry - 1) {
    + for (int retry = 0; retry < kMaxQueuePushRetry; ++retry) {
    +     if (QueueStatus::OK == ProcessQueueManager::GetInstance()->PushQueue(info.mQueueKey, std::move(item))) {
    +         break;
    +     }
    +     std::this_thread::sleep_for(std::chrono::milliseconds(kQueuePushRetrySleepMs));
    +     if (retry == kMaxQueuePushRetry - 1) {
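The suggestion does not show where the constants would be declared. A self-contained sketch of the same bounded-retry pattern might look like the following. The constant values mirror the literals 5 and 100 from the original code, while `PushWithRetry` and its `tryPush` parameter are hypothetical stand-ins for the PushQueue call.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Named constants replacing the literals 5 and 100 (names are illustrative).
constexpr int kMaxQueuePushRetry = 5;
constexpr std::chrono::milliseconds kQueuePushRetrySleep{100};

// Calls tryPush up to maxRetry times, sleeping between failed attempts.
// Returns true as soon as one attempt succeeds, false if all attempts fail.
bool PushWithRetry(const std::function<bool()>& tryPush,
                   int maxRetry = kMaxQueuePushRetry,
                   std::chrono::milliseconds sleep = kQueuePushRetrySleep) {
    for (int retry = 0; retry < maxRetry; ++retry) {
        if (tryPush()) {
            return true;
        }
        // Skip the sleep after the final failed attempt.
        if (retry + 1 < maxRetry) {
            std::this_thread::sleep_for(sleep);
        }
    }
    return false;
}
```

Taking the limits as defaulted parameters keeps the call site short while still letting tests or configuration override them.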

Comment on lines +253 to +262
    std::lock_guard guard(mMutex);
    mRouter.clear();
    for (auto& [configKey, pids] : result) {
        for (auto& pid : pids) {
            totalPids.insert(pid);
            auto it = mRouter.emplace(pid, std::unordered_set<ConfigKey>{}).first;
            auto& configSet = it->second;
            configSet.insert(configKey);
        }
    }

Copilot AI Nov 6, 2025

Clearing mRouter in HandleProcessDiscoveryEvent creates a race condition with HandleCpuProfilingEvent, which reads from mRouter. Events arriving between the clear and the rebuild could be lost or routed incorrectly.
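Whether this race exists depends on whether the reader takes the same mutex, which the thread does not show. Either way, a common pattern is to build the new routing table off to the side and swap it in under the lock, so readers never observe a partially built table. In the sketch below, `RouterHolder` is a hypothetical class and `ConfigKey` is aliased to `std::string` purely for illustration.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using ConfigKey = std::string; // hypothetical stand-in for the PR's ConfigKey
using Router = std::unordered_map<uint32_t, std::unordered_set<ConfigKey>>;

class RouterHolder {
public:
    // Builds the new table without holding the lock, then swaps it in atomically
    // with respect to Lookup().
    void Rebuild(const std::map<ConfigKey, std::vector<uint32_t>>& result) {
        Router next;
        for (const auto& [configKey, pids] : result) {
            for (uint32_t pid : pids) {
                next[pid].insert(configKey);
            }
        }
        std::lock_guard<std::mutex> guard(mMutex);
        mRouter.swap(next); // the old table is destroyed after the lock is released
    }

    // Returns a copy of the config set for a PID, or an empty set if unknown.
    std::unordered_set<ConfigKey> Lookup(uint32_t pid) const {
        std::lock_guard<std::mutex> guard(mMutex);
        auto it = mRouter.find(pid);
        return it == mRouter.end() ? std::unordered_set<ConfigKey>{} : it->second;
    }

private:
    mutable std::mutex mMutex;
    Router mRouter;
};
```

The lock is held only for the swap, so the time during which readers are blocked no longer grows with the number of discovered processes.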

Comment thread core/ebpf/driver/CpuProfiler.h Outdated
Comment on lines +170 to +172
    // TODO: make this non-static
    inline static livetrace_profiler_read_cb_ctx_t mHandler = nullptr;
    inline static void* mCtx = nullptr;

Copilot AI Nov 6, 2025

Static member variables for instance-specific handler and context violate encapsulation and prevent multiple CpuProfiler instances from working correctly. This TODO should be addressed before production use.
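A typical way to make such state per-instance is the trampoline pattern: register a static function with the C library and pass `this` as the user context. The sketch below assumes the library accepts a context pointer at registration time. The presence of handler_without_ctx in the earlier snippet suggests that may not be the case for every coolbpf callback, in which case some global lookup would still be needed. `ReadCallback` and `ProfilerWrapper` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical C-style callback shape: the library passes back a user-supplied ctx.
using ReadCallback = void (*)(uint32_t pid, const char* comm, const char* stack, uint32_t cnt, void* ctx);

class ProfilerWrapper {
public:
    // Stores the handler and context on this instance rather than in static members.
    void SetHandler(ReadCallback handler, void* ctx) {
        mHandler = handler;
        mCtx = ctx;
    }

    // Static trampoline: register this with the library and pass `this` as ctx.
    static void Trampoline(uint32_t pid, const char* comm, const char* stack, uint32_t cnt, void* self) {
        auto* wrapper = static_cast<ProfilerWrapper*>(self);
        if (wrapper != nullptr && wrapper->mHandler != nullptr) {
            wrapper->mHandler(pid, comm, stack, cnt, wrapper->mCtx);
        }
    }

private:
    ReadCallback mHandler = nullptr;
    void* mCtx = nullptr;
};
```

Because each wrapper carries its own handler and context, two instances can coexist without overwriting each other's state.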

@wokron wokron force-pushed the support-coolbpf-cpu-profiling branch from ecf5b83 to a3a8e56 Compare November 7, 2025 01:48
@wokron wokron force-pushed the support-coolbpf-cpu-profiling branch from c00d080 to 0f58800 Compare November 27, 2025 09:24
@wokron wokron force-pushed the support-coolbpf-cpu-profiling branch 5 times, most recently from 2a5948d to 6c51221 Compare December 22, 2025 02:21
@yyuuttaaoo yyuuttaaoo self-requested a review February 24, 2026 06:53
Comment thread core/container_manager/ContainerManager.cpp Outdated
Comment thread core/plugin/input/InputCpuProfiling.h
Comment thread core/plugin/input/InputCpuProfiling.cpp Outdated
Comment thread core/ebpf/plugin/cpu_profiling/ProcessDiscoveryManager.h Outdated
Comment thread core/ebpf/plugin/cpu_profiling/ProcessDiscoveryManager.h Outdated
Comment thread core/ebpf/plugin/cpu_profiling/CpuProfilingManager.cpp Outdated
Comment thread core/ebpf/plugin/cpu_profiling/CpuProfilingManager.cpp Outdated
Comment thread core/ebpf/plugin/cpu_profiling/CpuProfilingManager.cpp Outdated
Comment thread core/container_manager/ContainerManager.h Outdated
Comment thread core/container_manager/ContainerManager.cpp Outdated
@yyuuttaaoo
Collaborator

Please add usage documentation, following the pattern of the other plugins.

Comment thread core/ebpf/plugin/cpu_profiling/CpuProfilingManager.cpp Outdated
Comment thread core/ebpf/driver/CpuProfiler.h Outdated
Comment thread core/ebpf/driver/CpuProfiler.h Outdated
    std::lock_guard<std::mutex> lock(mMutex);
    if (mProfiler == nullptr) {
        livetrace_enable_tracing();
        mProfiler = livetrace_profiler_create();
Collaborator

Are there any resource-related parameters, such as one that controls the queue size? How is its resource usage bounded?

Author

Queue size control is provided on the coolbpf side, where the size of the bounded queue is determined from the profiling period. https://gitee.com/anolis/coolbpf/blob/master/src/profiler/src/probes/probes.rs#L274
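For readers unfamiliar with the idea, a bounded queue caps memory use by rejecting pushes once a fixed capacity is reached. The C++ sketch below illustrates only that general concept. It is not coolbpf's implementation (which is written in Rust), it does not reproduce how coolbpf derives the capacity from the profiling period, and it is not thread-safe. `BoundedQueue` is a hypothetical name.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Minimal single-threaded bounded queue: pushes fail once capacity is reached.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : mCapacity(capacity) {}

    // Returns false (and drops the item) when the queue is already full.
    bool TryPush(T item) {
        if (mItems.size() >= mCapacity) {
            return false;
        }
        mItems.push_back(std::move(item));
        return true;
    }

    // Moves the oldest item into out; returns false when the queue is empty.
    bool TryPop(T& out) {
        if (mItems.empty()) {
            return false;
        }
        out = std::move(mItems.front());
        mItems.pop_front();
        return true;
    }

    std::size_t Size() const { return mItems.size(); }

private:
    std::size_t mCapacity;
    std::deque<T> mItems;
};
```

With a fixed capacity, the worst-case memory held by the queue is the capacity multiplied by the size of one item, regardless of how fast the producer runs.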

@wokron wokron force-pushed the support-coolbpf-cpu-profiling branch from 8f81b95 to 1ffef40 Compare March 3, 2026 10:03
@wokron wokron force-pushed the support-coolbpf-cpu-profiling branch from 37e01d9 to 8191f24 Compare March 4, 2026 03:00
@wokron
Author

wokron commented Mar 6, 2026

> Please add usage documentation, following the pattern of the other plugins.

The documentation has been added.

@Takuka0311
Collaborator

Takuka0311 commented Apr 27, 2026

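Bailian automated review: recommend keeping this PR open.

This PR adds a coolbpf-based CPU profiling plugin (input_cpu_profiling) to LoongCollector, including the core C++ implementation, eBPF driver adaptation, unit tests, and Chinese documentation. Maintainers have carried out multiple rounds of code review, and the author has continued to respond and fix issues. The PR currently has merge conflicts because it is behind the main branch, but the feature is complete and has clear product value, so it is a valid feature PR under active development.

Recommended path to landing:

The author should rebase onto the latest main branch to resolve the merge conflicts and keep following up on the maintainers' review comments. Once CI passes and the review is approved, a maintainer can merge it into main.

Checked items:

Bailian review note: model qwen3.6-max-preview; compared against commit 7099f790b8a3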
