Welcome to the first article in our Agentic AI Security Series, where Unit 221B examines the security implications of AI-powered development tools. Our comprehensive technical analysis of Trae, ByteDance's AI coding assistant, reveals a sophisticated telemetry architecture operating within the application. While offering free access to Claude 3.7 Sonnet and GPT-4o, the IDE implements an extensive data collection system with multiple communication channels, persistent tracking capabilities, and comprehensive monitoring features. This analysis documents the technical components of this telemetry infrastructure, its network behavior, and the implications for developers considering such tools for their workflows.
Trae version 1.0.10282 for macOS
Trae is an adaptive AI IDE developed by ByteDance that transforms how developers work. Launched in early 2025, it positions itself as a revolutionary coding environment that collaborates with developers to ship faster. The application is distributed through ByteDance's Singapore-based subsidiary SPRING(SG)PTE.LTD.
According to Trae's official descriptions, the platform integrates two major international models - Claude 3.7 Sonnet and GPT-4o - both currently available for free. Trae is designed to compete functionally with products like Cursor and GitHub Copilot while offering a more accessible experience, especially for Chinese-speaking developers, with its bilingual interface supporting both English and Simplified Chinese.
Like other AI-powered IDEs, Trae is built on Microsoft's Visual Studio Code, allowing users to directly migrate plugins and settings from VS Code or Cursor for a seamless transition. The platform is available for both macOS and Windows. This analysis was conducted on the macOS version only. We have not analyzed the Windows version; we expect its telemetry to behave similarly, though implementation details may differ because of platform-specific requirements.
Trae highlights several key features in its promotional materials:
ByteDance positions Trae as democratizing programming for developers of all skill levels - emphasizing that even those with minimal coding experience can build functional applications with its assistance.
Trae has rapidly emerged as a formidable competitor to established AI coding assistants like Cursor and GitHub Copilot. Its main selling point? It's completely free - offering Claude 3.7 Sonnet and GPT-4o without any subscription fees. Unit 221B's technical analysis, using network traffic interception, binary analysis, and runtime monitoring, has identified a sophisticated telemetry framework that continuously transmits data to multiple ByteDance servers. From a cybersecurity perspective, this represents a complex data collection operation with significant security and privacy implications.
Key Findings:
The "Free" Claude and GPT-4o integration includes a sophisticated telemetry framework that continuously transmits data. Our investigation into Trae serves as a case study in how ostensibly free AI developer tools operate as sophisticated data collection systems. The level of instrumentation we've uncovered rivals enterprise-grade telemetry platforms (see definition below), with persistent tracking capabilities that survive even application reinstallation.
Note: In this analysis, we use the term "enterprise-grade telemetry" to describe data collection systems that exhibit characteristics typically found in corporate software: (1) architecturally complex with multiple specialized endpoints, (2) employing data segregation across different categories, (3) implementing redundant collection pathways, (4) featuring centralized management capabilities, (5) utilizing persistent device tracking, and (6) maintaining scheduled, consistent reporting intervals. These characteristics distinguish such systems from simpler telemetry implementations common in consumer applications.
For security professionals and development teams handling sensitive intellectual property, understanding these hidden data flows is critical when evaluating AI coding tools. The techniques documented here reflect industry-wide patterns that merit close attention in any secure development environment.
Our analysis reveals a multi-layered approach to data collection, with specialized endpoints for different types of telemetry data. The system employs a distributed architecture that segments data collection across domains:
Domain | Primary Function | Data Type | Protocol |
---|---|---|---|
mon-va.byteoversea.com | Primary telemetry collection | Application state, user behavior, performance metrics | HTTPS (POST) |
maliva-mcs.byteoversea.com | Configuration and heartbeat | System status, feature flags, configuration | HTTPS (POST) |
api.trae.ai | Core API services | Device registration, configuration queries | HTTPS (GET/POST) |
api-sg-central.trae.ai | Regional API services | Regional backend interactions, device logs | HTTPS (POST) |
bytegate-sg.byteintlapi.com | Feature gate management | Feature flags, workspace configuration | HTTPS (POST) |
lf3-static.bytednsdoc.com | Static resource delivery | Control URLs, configuration data | HTTPS (GET) |
Through network traffic analysis, Unit 221B's research team captured and analyzed the communication between Trae and ByteDance servers. The following patterns emerged:
Our traffic analysis confirms that Trae establishes and maintains persistent connections to ByteDance servers, even during periods of complete inactivity by the user. These connections use HTTPS POST requests to transmit compressed data at regular intervals:

    // Network connection sample - observed during 20-minute monitoring session
    [18:32:26.799] Server connect mon-va.byteoversea.com:443 (23.43.85.213:443)
    [::1]:62515: POST https://mon-va.byteoversea.com/monitor_browser/collect/batch/?biz_id=marscode_nativeide_us
        << HTTP/2.0 204 No Content 0b
    // Subsequent connection occurs approximately 30 seconds later

Trae implements a permanent device identification system using a machine ID that persists across installations. This ID appears to be a cryptographic hash that is passed in all configuration and API requests:

    [18:32:26.799] Server connect api.trae.ai:443 (23.43.85.213:443)
    [::1]:62515: GET https://api.trae.ai/icube/api/v1/native/config/query?machineId=[REDACTED_MACHINE_ID]… HTTP/2.0
        << HTTP/2.0 200 OK 1.3k

The machineId parameter (truncated above) is a SHA-256 hash derived from hardware identifiers, creating a persistent fingerprint of the user's system that can be tracked across installations and sessions.
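To illustrate why a hardware-derived identifier survives reinstallation, here is a minimal sketch of how such an ID could be computed. The inputs (`hardware_uuid`, `mac_address`) and the function name `derive_machine_id` are hypothetical; our analysis does not establish which hardware identifiers Trae hashes or how it combines them.

```python
import hashlib


def derive_machine_id(hardware_uuid: str, mac_address: str) -> str:
    """Hash stable hardware identifiers into a 64-character hex digest.

    Hypothetical illustration only: the real inputs used by Trae are unknown.
    """
    # Normalize the inputs so the same hardware always yields the same digest.
    material = f"{hardware_uuid.strip().lower()}|{mac_address.strip().lower()}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# The digest depends only on hardware values, not on anything the application
# stores on disk, so removing and reinstalling the app would not change it.
first_install = derive_machine_id("ABCD-1234", "AA:BB:CC:DD:EE:FF")
after_reinstall = derive_machine_id("abcd-1234", "aa:bb:cc:dd:ee:ff")
```

Because nothing in the derivation depends on application state, clearing application data alone would not reset an identifier built this way.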
Trae employs a geographically distributed server infrastructure, with region-specific endpoints for different functions. For example, we observed connections to Singapore-based servers for API requests:

    [18:32:31.122] Server connect api-sg-central.trae.ai:443 (23.43.85.219:443)
    [::1]:62519: POST https://api-sg-central.trae.ai/icube/api/v1/device/log/check HTTP/2.0
        << HTTP/2.0 200 OK 61b

This distributed architecture allows ByteDance to segregate data collection by region while maintaining centralized control through their global infrastructure.
In addition to standard HTTPS traffic, our analysis identified persistent WebSocket connections established locally, likely for inter-process communication or real-time feature updates. Deep analysis of these connections revealed the following patterns of internal data handling:
AI Completion Endpoint: ws://127.0.0.1:51000/module/aicompletion/0
This connection uses the Language Server Protocol (LSP) format with JSON payloads for AI code completion. Key data flows include:

- Session initialization data, including a teaConfig object containing analytics configuration.
- Full file contents, sent on textDocument/didOpen events and again with every change via textDocument/didChange, essentially logging all code edits in their entirety.
- $/ping and $/pong messages every 20 seconds to maintain awareness of active editing sessions.

The volume and sensitivity of data flowing through this local channel closely parallel the external telemetry patterns, suggesting an integrated collection strategy.
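Analysts who capture this channel themselves can measure how much document text each message carries. The sketch below assumes the messages follow the standard LSP notification layout (`params.textDocument.text` for didOpen, `params.contentChanges[].text` for didChange); the sample message and the function name `document_text_bytes` are ours, not taken from the capture.

```python
import json


def document_text_bytes(raw_message: str) -> int:
    """Return how many bytes of document text an LSP notification carries."""
    message = json.loads(raw_message)
    method = message.get("method")
    params = message.get("params", {})
    if method == "textDocument/didOpen":
        # didOpen carries the whole document in params.textDocument.text.
        texts = [params.get("textDocument", {}).get("text", "")]
    elif method == "textDocument/didChange":
        # didChange carries one or more change events, each with a text field.
        texts = [change.get("text", "") for change in params.get("contentChanges", [])]
    else:
        texts = []
    return sum(len(text.encode("utf-8")) for text in texts)


# Hypothetical sample message, constructed for illustration.
sample = json.dumps({
    "jsonrpc": "2.0",
    "method": "textDocument/didOpen",
    "params": {"textDocument": {"uri": "file:///example/app.js",
                                "languageId": "javascript",
                                "version": 1,
                                "text": "console.log('hi');\n"}},
})
size = document_text_bytes(sample)
```

Summing this value over a session gives a rough measure of how much source text passes through the completion channel.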
Manager Endpoint: ws://127.0.0.1:51000/manager/
This channel uses binary MessagePack format (WebSocket opcode 2) to coordinate a complex microservice architecture running locally:

- Internal services (ai-agent, ckg, aicompletion) running on specific local ports.
- update_snapshot requests to the ai-agent service containing complete file content packaged as "snapshots," labeled with created_by: "ai", effectively creating a second pathway for full code content to move through the system.

The use of binary MessagePack format provides a layer of obfuscation compared to the plain JSON on the AI completion channel, potentially making manual inspection more difficult.

    # Authentication data sent over AI completion WebSocket
    {
      "userInfo": {
        "name": "[REDACTED_USERNAME]",
        "token": "[REDACTED_JWT_TOKEN]",
        "region": "US",
        "is_internal": false,
        "user_id": "[REDACTED_USER_ID]"
      },
      "authInfo": {
        "jwtTokenType": "Cloud-IDE-JWT"
      },
      "teaConfig": {
        "icube_uid": "[REDACTED_USER_ID]",
        "user_id": "[REDACTED_USER_ID]",
        "biz_user_id": "[REDACTED_USER_ID]",
        "user_is_login": true,
        "device_id": "[REDACTED_DEVICE_ID]",
        "machine_id": "[REDACTED_MACHINE_ID]",
        "arch": "arm64",
        "system": "darwin",
        "build_version": "1.0.10282",
        "region": "US"
      }
    }

    # Code content sent over manager WebSocket (MessagePack decoded)
    {
      "service": "snapshot",
      "method": "update_snapshot",
      "data": {
        "snapshot_diff_data": {
          "file_infos": [{
            "file_path": "/[REDACTED_PATH]/src/utils/logger.js",
            "current_content": "",
            "new_content": "import fs from 'fs';\nimport path from 'path';\nimport os from 'os';\n\n// Define log levels\nconst LOG_LEVELS = {\n ERROR: 0,\n WARN: 1,\n...[FULL FILE CONTENT]",
            "created_by": "ai",
            "file_action": "added"
          }]
        },
        "project_id": "[REDACTED_PROJECT_ID]",
        "chat_session_id": "[REDACTED_SESSION_ID]"
      }
    }

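Once a manager-channel message has been decoded from MessagePack into a dictionary, it can be summarized without printing the code it contains. This is a minimal sketch that assumes the decoded structure matches the update_snapshot sample above; the function name `summarize_snapshot` and the example values are ours.

```python
from typing import Any


def summarize_snapshot(decoded: dict[str, Any]) -> list[dict[str, Any]]:
    """List path, action, author label, and content size for each file
    in a decoded update_snapshot message. Returns [] for other methods."""
    if decoded.get("method") != "update_snapshot":
        return []
    file_infos = (decoded.get("data", {})
                         .get("snapshot_diff_data", {})
                         .get("file_infos", []))
    return [
        {
            "file_path": info.get("file_path"),
            "file_action": info.get("file_action"),
            "created_by": info.get("created_by"),
            # Record only the size, so the summary itself holds no source code.
            "new_content_chars": len(info.get("new_content", "")),
        }
        for info in file_infos
    ]


# Illustrative message shaped like the sample above (values are placeholders).
example = {
    "service": "snapshot",
    "method": "update_snapshot",
    "data": {"snapshot_diff_data": {"file_infos": [{
        "file_path": "/example/src/utils/logger.js",
        "current_content": "",
        "new_content": "import fs from 'fs';\n",
        "created_by": "ai",
        "file_action": "added",
    }]}},
}
summary = summarize_snapshot(example)
```

Keeping only sizes and paths lets a reviewer gauge how much code moves through the channel without copying that code into analysis notes.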
This internal architecture documents how ByteDance maintains a system for tracking user activity, code content, and system information. The local WebSocket traffic illustrates how data flows from user editing sessions through internal channels.
Security Consideration: The confirmed internal movement of full document content through multiple channels establishes that the application processes complete file contents locally. While we can directly observe code flowing through these internal channels, the external telemetry's compressed or encrypted nature makes it difficult to conclusively verify whether any of this data is transmitted to remote servers. The transmission of authentication tokens (JWT) and credentials through multiple local channels creates additional attack vectors for potential credential interception. The duplicate transmission of file contents through separate channels represents a design choice that may impact both performance and security models.
Through binary analysis and decompilation of the Trae application, we identified ByteDance's proprietary telemetry framework implemented as part of the Electron application. The primary components include:

- @byted-icube/slardar - Core telemetry collection system [1]
- @byted-icube/tea - User behavior analytics (teaConfig object)
- @byted/device-register - Persistent device tracking

[1] Note: ByteDance's Slardar telemetry framework has been independently documented in security research as a known component of ByteDance applications, where it's described as enabling "remote configuration, feature flagging, and policy enforcement." Similar telemetry mechanisms have been observed in multiple ByteDance products.
The telemetry system is deeply integrated into Trae's Electron runtime, as evidenced by this startup log showing the command line parameters and initialization sequence:

    [main 2025-03-30T22:46:40.370Z] ICUBE:update scheduleCheckForUpdates, updateInterval -> 60
    [18:46:40.370][127.0.0.1:64408] server connect mon-va.byteoversea.com:443 (147.160.190.227:443)
    127.0.0.1:64408: POST https://mon-va.byteoversea.com/monitor_browser/collect/batch/

Our network analysis revealed the following server infrastructure supporting Trae's telemetry system:
Domain | IP Addresses | Location | Provider |
---|---|---|---|
mon-va.byteoversea.com | 147.160.190.227, 147.160.190.228, 71.18.74.198, 71.18.1.198 | United States (Virginia) | Akamai Edge Network |
maliva-mcs.byteoversea.com | 184.25.58.58 | United States | Akamai Edge Network |
api.trae.ai | 23.43.85.213, 23.43.85.216 | United States | Akamai Edge Network |
api-sg-central.trae.ai | 23.43.85.219 | Singapore | Akamai Edge Network |
ByteDance leverages Akamai's global edge network to distribute their telemetry collection infrastructure, allowing them to potentially manage data flows across multiple jurisdictions [2]. This relationship has been documented in network traffic analysis showing ByteDance services routing through Akamai's content delivery network infrastructure.
[2] Note: The relationship between ByteDance and Akamai has been confirmed through independent network analysis. In 2024, research from Kentik (reported by Data Center Dynamics) documented TikTok traffic shifting to "third-party CDNs provided by vendors such as Akamai and Fastly," establishing the business relationship between these companies.
Our traffic capture revealed the frequency of network connections. The following timeline illustrates the regular pattern of telemetry transmissions:

    18:37:09.878 - Telemetry POST to mon-va.byteoversea.com
    18:37:39.930 - Server disconnect
    18:37:44.366 - Server reconnect, new telemetry POST
    18:38:41.014 - Server disconnect
    18:39:10.110 - Server reconnect, new telemetry POST
    18:39:40.872 - Server disconnect
    18:40:09.880 - Server reconnect, new telemetry POST
    18:41:07.887 - Server disconnect
    18:41:09.880 - Server reconnect, new telemetry POST

This pattern shows connection events to the telemetry servers on a roughly 30-second cadence, with a new telemetry POST every 35 to 86 seconds in this sample, even during periods of complete inactivity. Each reconnect involves a POST request to the telemetry endpoint, indicating regular data transmission regardless of user activity.
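The gaps between telemetry POSTs can be computed directly from the timeline above. This short sketch uses only the five POST timestamps shown there; the function name `gaps_in_seconds` is ours.

```python
from datetime import datetime

# Telemetry POST timestamps copied from the timeline above.
POST_TIMES = ["18:37:09.878", "18:37:44.366", "18:39:10.110",
              "18:40:09.880", "18:41:09.880"]


def gaps_in_seconds(times: list[str]) -> list[float]:
    """Return the elapsed seconds between each consecutive pair of timestamps."""
    parsed = [datetime.strptime(t, "%H:%M:%S.%f") for t in times]
    return [round((later - earlier).total_seconds(), 3)
            for earlier, later in zip(parsed, parsed[1:])]


gaps = gaps_in_seconds(POST_TIMES)
print(gaps)  # [34.488, 85.744, 59.77, 60.0]
```

In this sample the POST gaps range from about 34 to 86 seconds, so detection rules keyed to a strict 30-second period may need a wider tolerance.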
Our analysis uncovered ByteDance's sophisticated feature gate system that controls Trae's functionality remotely:

    [18:32:43.459] Server connect bytegate-sg.byteintlapi.com:443 (23.43.85.219:443)
    [::1]:62575: POST https://bytegate-sg.byteintlapi.com/api/v1/workspace/feature_gates/values HTTP/2.0
        << HTTP/2.0 200 OK 805b
    [::1]:62573: GET https://lf3-static.bytednsdoc.com/obj/eden-cn/lkpkbvsj/ljhwZthlaukjlkulzlp/marketplace/controlUrl.json HTTP/2.0
        << HTTP/2.0 200 OK 11.3k

This system allows ByteDance to remotely enable or disable features, potentially targeting specific regions, users, or workspaces. The control infrastructure provides centralized management of the application's behavior, allowing ByteDance to modify functionality without pushing updates.
Interestingly, Trae also connects to Microsoft telemetry services, potentially as part of its Visual Studio Code core:

    [18:46:41.393] Server connect mobile.events.data.microsoft.com:443 (20.189.173.18:443)
    127.0.0.1:64412: POST https://mobile.events.data.microsoft.com/OneCollector/1.0?cors=true&content-type=application/x-json-stream
        << 200 OK 9b

This creates a scenario where user information may be subject to the telemetry policies of both ByteDance and Microsoft. This dual-layer architecture is consistent with how VSCode-based applications typically operate, as documented by Microsoft, but with the important distinction that ByteDance has added its own extensive telemetry layer on top of Microsoft's baseline collection.
Privacy Consideration: The combination of ByteDance telemetry and Microsoft telemetry creates a multi-layered tracking architecture that may exceed data minimization expectations in some privacy frameworks. While Microsoft publicly commits that user data "is not used to train foundation models," we found no equivalent public commitments from ByteDance regarding limitations on data use from Trae.
The extensive data collection infrastructure in Trae can be analyzed in terms of common technology business models that leverage user data:
This approach of offering free tools with extensive telemetry capabilities reflects common practices in the technology industry, where data collection often forms part of the value exchange for free services.
Our technical analysis identified the primary API endpoints used by Trae and their functions:
Endpoint | Function | Data Transmitted |
---|---|---|
/monitor_browser/collect/batch/ | Telemetry collection | Application state, user actions, performance metrics |
/icube/api/v1/native/config/query | Configuration retrieval | Machine ID, application state, version information |
/icube/api/v1/device/log/check | Device logging status | Device information, log configuration |
/api/v1/workspace/feature_gates/values | Feature gate configuration | Workspace information, user context |
/api/sdk/check_update | Update checks | Current version, build ID, user ID, platform |
ws://127.0.0.1:51000/module/aicompletion/0 | AI completion WebSocket | Full file contents, user credentials, system information, editing activity |
ws://127.0.0.1:51000/manager/ | Internal service manager | MessagePack-encoded snapshots with file contents, credentials, routing between internal services |
The API structure reveals a sophisticated platform designed for comprehensive monitoring and remote control. The WebSocket endpoints further reveal how data flows internally between components before potentially being aggregated for external transmission.
For security teams looking to identify and monitor Trae's telemetry activities within their enterprise networks, Unit 221B has compiled the following Tactics, Techniques, and Procedures (TTPs) based on our analysis:
Category | Indicators | Detection Guidance |
---|---|---|
Domain Patterns | *.byteoversea.com, *.trae.ai, *.byteintlapi.com, *.bytednsdoc.com | Configure network monitoring to flag connections to these domain patterns. While some legitimate ByteDance services may use these domains, in corporate environments without approved ByteDance applications, these connections may indicate unauthorized software. |
Specific Endpoints | mon-va.byteoversea.com, maliva-mcs.byteoversea.com, api.trae.ai, api-sg-central.trae.ai, bytegate-sg.byteintlapi.com | These specific domains are directly associated with Trae's telemetry infrastructure. Monitor or block these endpoints in environments with stringent data security requirements. |
IP Addresses | 147.160.190.227, 147.160.190.228, 71.18.74.198, 71.18.1.198, 184.25.58.58, 23.43.85.213, 23.43.85.216, 23.43.85.219 | While these IPs are associated with Akamai's edge network and may change, persistent connections to these ranges combined with other indicators may help identify the application. |
Behavior | Detection Method |
---|---|
Regular 30-second POST intervals to ByteDance domains | Monitor for cyclical HTTP POST requests occurring approximately every 30 seconds to the domains listed above, particularly to mon-va.byteoversea.com/monitor_browser/collect/batch/ endpoints. |
Persistent connections during idle periods | Look for sustained network connections to telemetry endpoints even when workstations are idle, particularly distinguishing from regular background processes. |
204 No Content responses | The telemetry endpoints often respond with HTTP 204 (No Content) status codes, especially from the monitor_browser/collect/batch endpoint. |
Multiple parallel connections | Trae establishes connections to multiple ByteDance domains simultaneously, creating a distinctive network signature of parallel connections to different ByteDance infrastructure. |
Local WebSocket traffic on port 51000 | For endpoint monitoring solutions, watch for local WebSocket connections on port 51000, particularly to ws://127.0.0.1:51000/module/aicompletion/0 and ws://127.0.0.1:51000/manager/ |
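For the local WebSocket indicator, an endpoint check can start with a simple TCP probe of the loopback port. This is a minimal sketch using only the standard library; the function name `local_port_is_listening` is ours, and an open port 51000 on its own does not prove that Trae is the listener, so treat a positive result as a prompt for further inspection.

```python
import socket


def local_port_is_listening(port: int = 51000, host: str = "127.0.0.1",
                            timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # A successful connect means a process is listening; close it right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    print("port 51000 listening:", local_port_is_listening())
```

Pairing a positive probe with the process and file indicators later in this section reduces the chance of a false match on an unrelated service.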

    POST https://mon-va.byteoversea.com/monitor_browser/collect/batch/?biz_id=marscode_nativeide_us HTTP/2
    [Headers]
    content-type: application/json
    user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Trae/1.0.10282 Chrome/110.0.5481.52 Electron/26.6.0 Safari/537.36
    [Body contains compressed JSON]


    GET https://api.trae.ai/icube/api/v1/native/config/query?machineId=[HASH]&platform=darwin&version=1.0.10282&language=en-US HTTP/2
    [Headers]
    user-agent: Trae/1.0.10282 Electron/26.6.0
    accept: application/json

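When reviewing proxy logs, the configuration query shown above can be picked out and its parameters extracted with the standard library. This sketch assumes the host and path match the sample request exactly; `config_query_fields` is our name, and `EXAMPLE_HASH` is a placeholder rather than a real machine ID.

```python
from urllib.parse import parse_qs, urlsplit

CONFIG_HOST = "api.trae.ai"
CONFIG_PATH = "/icube/api/v1/native/config/query"


def config_query_fields(url: str) -> dict[str, str]:
    """Extract query parameters from a config-query URL, or {} if it does not match."""
    parts = urlsplit(url)
    if parts.hostname != CONFIG_HOST or parts.path != CONFIG_PATH:
        return {}
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(parts.query).items()}


url = ("https://api.trae.ai/icube/api/v1/native/config/query"
       "?machineId=EXAMPLE_HASH&platform=darwin&version=1.0.10282&language=en-US")
fields = config_query_fields(url)
```

Grouping log entries by the extracted machineId value shows which requests came from the same device across sessions.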
For organizations looking to restrict Trae's telemetry, consider implementing the following firewall rules:

    # Block primary telemetry domains
    block domain *.byteoversea.com
    block domain *.trae.ai
    block domain *.byteintlapi.com
    block domain *.bytednsdoc.com

    # Block specific high-value endpoints
    block domain mon-va.byteoversea.com
    block domain maliva-mcs.byteoversea.com
    block domain api.trae.ai
    block domain api-sg-central.trae.ai
    block domain bytegate-sg.byteintlapi.com

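The wildcard rules above are written in a generic firewall notation. As a hedged illustration of how they evaluate, this sketch applies the same four patterns to hostnames with Python's `fnmatch`; real firewalls and DNS filters may interpret wildcards differently, and `is_blocked` is our own helper name.

```python
from fnmatch import fnmatch

# Wildcard patterns copied from the firewall rules above.
BLOCK_PATTERNS = [
    "*.byteoversea.com",
    "*.trae.ai",
    "*.byteintlapi.com",
    "*.bytednsdoc.com",
]


def is_blocked(hostname: str) -> bool:
    """Return True if the hostname matches any wildcard block pattern."""
    # Lowercase and drop a trailing dot so fully qualified names compare cleanly.
    host = hostname.lower().rstrip(".")
    return any(fnmatch(host, pattern) for pattern in BLOCK_PATTERNS)


for name in ("mon-va.byteoversea.com", "api.trae.ai", "trae.ai", "example.com"):
    print(name, is_blocked(name))
```

Under this matching, a pattern such as `*.trae.ai` covers subdomains but not the bare `trae.ai` name, which would need its own rule if it should also be blocked.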
For Security Information and Event Management (SIEM) systems, implement the following detection logic:

    # Pseudo-code for SIEM rule
    rule "Trae Telemetry Detection" {
      events:
        $e1 = network_connection(
          destination_domain MATCHES "*.byteoversea.com" OR
          destination_domain MATCHES "*.trae.ai" OR
          destination_domain MATCHES "*.byteintlapi.com"
        )
      timewindow: 2 minutes
      condition:
        # Look for the characteristic 30-second interval pattern
        count($e1) >= 3 AND
        # Check for consistent time gaps between events
        (max_time_between($e1) <= 40 seconds AND min_time_between($e1) >= 25 seconds)
      actions:
        alert("Potential Trae telemetry traffic detected from " + $e1.source_ip)
    }

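The rule above is vendor-neutral pseudo-code. For teams that want to test the logic against exported connection logs first, here is a minimal Python sketch of the same condition: at least three matching connections inside a two-minute window, with every gap between 25 and 40 seconds. The event format (timestamp, domain) and the function name `matches_beacon_pattern` are assumptions for illustration.

```python
from datetime import datetime, timedelta
from fnmatch import fnmatch

# Domain patterns copied from the pseudo-code rule above.
DOMAIN_PATTERNS = ["*.byteoversea.com", "*.trae.ai", "*.byteintlapi.com"]


def matches_beacon_pattern(events, window=timedelta(minutes=2),
                           min_gap=25.0, max_gap=40.0) -> bool:
    """events: iterable of (timestamp: datetime, destination_domain: str)."""
    times = sorted(ts for ts, domain in events
                   if any(fnmatch(domain.lower(), p) for p in DOMAIN_PATTERNS))
    # Try each matching event as the start of a candidate window.
    for start in range(len(times)):
        in_window = [t for t in times[start:] if t - times[start] <= window]
        if len(in_window) < 3:
            continue
        gaps = [(b - a).total_seconds() for a, b in zip(in_window, in_window[1:])]
        # Every gap in the window must fall inside the tolerance band.
        if min_gap <= min(gaps) and max(gaps) <= max_gap:
            return True
    return False


t0 = datetime(2025, 3, 30, 18, 37, 0)
sample_events = [(t0 + timedelta(seconds=30 * i), "mon-va.byteoversea.com")
                 for i in range(3)]
detected = matches_beacon_pattern(sample_events)
```

The `min_gap` and `max_gap` parameters are exposed so the tolerance band can be tuned against traffic actually observed in a given environment.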
For endpoint detection systems, monitor for these process and file indicators:

- Process names: Trae or Trae Helper
- Application binary: /Applications/Trae.app/Contents/MacOS/Trae
- Application data directory: ~/Library/Application Support/Trae/

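The file indicators above can be checked with a few lines of standard-library Python on a macOS endpoint. This is a minimal sketch; the function name `present_indicators` is ours, and the paths are the two listed above, which reflect the default macOS install analyzed here and may differ for other versions or custom install locations.

```python
from pathlib import Path

# File-system indicators from the list above (default macOS locations).
INDICATOR_PATHS = [
    "/Applications/Trae.app/Contents/MacOS/Trae",
    "~/Library/Application Support/Trae/",
]


def present_indicators(paths=INDICATOR_PATHS) -> list[str]:
    """Return the indicator paths that exist on this machine."""
    # expanduser() resolves the leading ~ to the current user's home directory.
    return [p for p in paths if Path(p).expanduser().exists()]


if __name__ == "__main__":
    for found in present_indicators():
        print("indicator present:", found)
```

Because the data directory lives under each user's home, a fleet-wide check would need to run once per user account rather than once per machine.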
To prevent potential data leakage through Trae, security teams should consider:
These TTPs should help security teams effectively detect and monitor Trae's telemetry activities within their networks, enabling informed decisions about the use of such tools in their environments based on their specific security requirements and risk tolerance.
Our technical analysis of Trae documents an application with comprehensive telemetry capabilities integrated throughout its architecture. The application provides Claude 3.7 Sonnet and GPT-4o AI features while maintaining regular network connections to ByteDance servers via multiple endpoints.
The regular cadence of these connections and the variety of data categories being collected indicate a significant investment in telemetry infrastructure, likely supporting both product improvement and user analytics objectives. This approach reflects common practices in free AI tools where data collection often forms part of the underlying business model.
From a security perspective, Trae represents a case study in how modern AI development tools can function as sophisticated data collection platforms. The multi-layered telemetry architecture, redundant data flows, and persistent tracking capabilities demonstrate advanced techniques that security professionals should be aware of when evaluating tools for sensitive development environments.
The use of AI coding tools requires careful consideration of data collection practices. At Unit 221B, we believe in empowering developers and organizations to make informed decisions about the tools they use. Understanding these data flows is crucial for managing sensitive data and maintaining control over development environments. Teams should weigh the convenience of AI tools against their data handling practices and make that choice an informed one.
This analysis documents the application behavior observed through technical analysis of network traffic conducted in March 2025. The observed behavior and configurations may change in future software updates.
Unit 221B will continue monitoring the evolution of AI coding assistant tools and their security implications as part of our ongoing threat intelligence work. The balance between AI capability and data collection practices represents one of the key security considerations for development teams in 2025.
This analysis is based on technical analysis conducted by our research team, but several external sources provide supporting context:
For organizations interested in further research on telemetry patterns in AI coding tools, we recommend capturing network traffic with appropriate tools to observe telemetry endpoints and data collection patterns directly.
At Unit 221B, we believe in providing security professionals and developers with technical insights needed to make informed technology decisions. Understanding the data collection capabilities and communication patterns in modern AI development tools is essential when evaluating their appropriate use, particularly in environments with specific security or privacy requirements. This analysis aims to contribute to that understanding by documenting observable telemetry mechanisms in popular development tools.