SWE-gen Progress Dashboard

Last updated: 2026-05-13 13:49:42 BJT | Next refresh: 2026-05-13 14:49:42 BJT | Refresh interval: 3600 seconds

Total PRs collected: 423,933 (1h 0 / 24h +1,612)
Total valid SWE tasks: 28,751 (1h +13 / 24h +886)
Overall processing success rate: 6.8% (Valid SWE / Collected PRs)
Mean difficulty_score: 5.76 (median 5.7, count 28,704)

Progress by language

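Language | Collected PRs | Past 1h | Past 24h | Valid SWE | Past 1h | Past 24h | Success rate
C (c) | 28,202 | 0 | +77 | 7,734 | 0 | +335 | 27.4%
C++ (cpp) | 43,066 | 0 | +160 | 1,895 | +9 | +62 | 4.4%
Go (go) | 87,608 | 0 | +91 | 4,447 | +1 | +119 | 5.1%
Java (java) | 59,889 | 0 | +42 | 2,609 | +2 | +30 | 4.4%
JavaScript (js) | 30,786 | 0 | +61 | 3,666 | 0 | +7 | 11.9%
Python (py) | 68,083 | 0 | +309 | 2,518 | 0 | 0 | 3.7%
Rust (rust) | 50,436 | 0 | +50 | 2,500 | 0 | +65 | 5.0%
TypeScript (ts) | 55,863 | 0 | +822 | 3,382 | +1 | +268 | 6.1%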

fix.patch complexity

Language | Valid SWE count | Avg fix.patch lines | Avg fix.patch hunks | Avg fix.patch files
C | 7,734 | 281.22 | 15.40 | 4.93
C++ | 1,895 | 332.20 | 11.42 | 4.53
Go | 4,447 | 271.26 | 14.93 | 4.97
Java | 2,609 | 171.54 | 10.92 | 4.46
JavaScript | 3,666 | 73.31 | 6.22 | 2.76
Python | 2,518 | 135.78 | 9.99 | 3.44
Rust | 2,500 | 257.24 | 12.86 | 4.06
TypeScript | 3,382 | 159.88 | 9.10 | 4.09

Methodology notes

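Difficulty scoring (difficulty_score)

For each valid task directory, solution/fix.patch, tests/, and instruction.md are read and scored statically by src/swegen/scoring.py, with zero API calls.

The current formula uses continuous log-scale scoring so that medium-sized patches are not labeled hard too early. The weights are: patch_scope 38%, logic_complexity 32%, context_breadth 15%, test_complexity 10%, instruction_complexity 5%.

Label thresholds: easy <= 4.0, medium <= 7.0, hard > 7.0.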
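As a rough illustration of how the weights and thresholds above could combine, here is a minimal Python sketch. It is not the code in src/swegen/scoring.py: the function names, the 0..10 range of the sub-scores, and the `log_scaled` helper with its `cap` parameter are all assumptions made for this example. Only the weights and the label thresholds come from the dashboard.

```python
import math

# Weights as listed on the dashboard (they sum to 1.0).
WEIGHTS = {
    "patch_scope": 0.38,
    "logic_complexity": 0.32,
    "context_breadth": 0.15,
    "test_complexity": 0.10,
    "instruction_complexity": 0.05,
}


def log_scaled(value: float, cap: float) -> float:
    """Hypothetical log-scale mapping of a raw size onto 0..10.

    Assumption: the real formula in scoring.py may look quite different.
    Values at or above `cap` saturate at 10.
    """
    if value <= 0:
        return 0.0
    return min(10.0, 10.0 * math.log1p(value) / math.log1p(cap))


def difficulty_score(components: dict[str, float]) -> float:
    """Weighted sum of per-dimension sub-scores (assumed to be on 0..10)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def difficulty_label(score: float) -> str:
    """Map a score to a label using the dashboard thresholds."""
    if score <= 4.0:
        return "easy"
    if score <= 7.0:
        return "medium"
    return "hard"
```

A caller would compute one sub-score per dimension, pass them to `difficulty_score`, and feed the result to `difficulty_label`.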

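Tag generation and display

Tags are not computed live by the dashboard. They are generated by an LLM from the PR information when swegen builds a task, and are written to the [metadata].tags field of task.toml.

The prompt asks for tags in three parts: programming language, project layer/domain, and framework/library name or specific topic. The dashboard only reads the existing task.toml files and counts, for each language, how often each tag appears and its share.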

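fix.patch statistics

Patch statistics come from each valid task's solution/fix.patch, filtered to code files by language file extension, consistent with the code-only statistics in upload_march_swe_to_hf.py.

Avg fix.patch lines counts the added and deleted lines in the code-file diffs; Avg fix.patch hunks counts the @@ hunks; Avg fix.patch files counts the code files touched.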

difficulty_label distribution

Language | easy | medium | hard
C | 735 | 5,189 | 1,802
C++ | 321 | 1,181 | 389
Go | 370 | 3,276 | 795
Java | 347 | 1,663 | 596
JavaScript | 593 | 2,724 | 348
Python | 211 | 1,724 | 560
Rust | 260 | 1,464 | 774
TypeScript | 345 | 2,562 | 475

difficulty_score overview

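Language | count | min | p25 | median | mean | p75 | max
C | 7,726 | 2.4 | 4.9 | 5.9 | 5.91 | 7.0 | 9.2
C++ | 1,891 | 2.5 | 4.4 | 5.6 | 5.63 | 6.8 | 9.0
Go | 4,441 | 2.6 | 4.9 | 5.8 | 5.81 | 6.7 | 9.1
Java | 2,606 | 2.8 | 4.7 | 5.8 | 5.80 | 6.9 | 9.2
JavaScript | 3,665 | 2.6 | 4.4 | 5.2 | 5.28 | 6.1 | 9.2
Python | 2,495 | 2.6 | 4.9 | 5.8 | 5.90 | 6.9 | 8.9
Rust | 2,498 | 2.7 | 4.9 | 6.2 | 6.10 | 7.4 | 9.0
TypeScript | 3,382 | 2.7 | 4.6 | 5.5 | 5.58 | 6.5 | 8.9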

Global top tags

library: 12,989 (45.2%)
backend: 8,838 (30.8%)
cli: 4,016 (14.0%)
frontend: 1,701 (5.9%)
testing: 1,229 (4.3%)
react: 917 (3.2%)
http: 902 (3.1%)
framework: 776 (2.7%)
embedded: 567 (2.0%)
cpp: 396 (1.4%)
networking: 359 (1.3%)
async: 325 (1.1%)
kubernetes: 258 (0.9%)
graphql: 229 (0.8%)
postgresql: 226 (0.8%)
eslint: 217 (0.8%)
parsing: 207 (0.7%)
aws: 182 (0.6%)
angular: 174 (0.6%)
kernel: 173 (0.6%)
compiler: 172 (0.6%)
firmware: 170 (0.6%)
quic: 166 (0.6%)
git: 165 (0.6%)
json: 162 (0.6%)
redis: 154 (0.5%)
aem: 147 (0.5%)
security: 142 (0.5%)
rust: 141 (0.5%)
tls: 138 (0.5%)

Tag distribution by language

C (c)

library: 4,080 (52.8%)
backend: 1,884 (24.4%)
cli: 938 (12.1%)
embedded: 562 (7.3%)
cpp: 395 (5.1%)
networking: 287 (3.7%)
testing: 213 (2.8%)
postgresql: 186 (2.4%)
kernel: 173 (2.2%)
framework: 172 (2.2%)
firmware: 170 (2.2%)
quic: 161 (2.1%)
http: 146 (1.9%)
rust: 131 (1.7%)
bluetooth: 121 (1.6%)
tls: 119 (1.5%)
ruby: 113 (1.5%)
scheduler: 110 (1.4%)
cryptography: 102 (1.3%)
python: 88 (1.1%)

C++ (cpp)

library: 1,392 (73.6%)
backend: 308 (16.3%)
testing: 239 (12.6%)
cli: 151 (8.0%)
boost: 92 (4.9%)
framework: 76 (4.0%)
http: 69 (3.6%)
async: 62 (3.3%)
compiler: 41 (2.2%)
parsing: 38 (2.0%)
serialization: 36 (1.9%)
templates: 29 (1.5%)
actor-framework: 28 (1.5%)
arrayfire: 28 (1.5%)
formatting: 27 (1.4%)
redis: 22 (1.2%)
logging: 21 (1.1%)
audio: 20 (1.1%)
sparql: 20 (1.1%)
iceberg: 19 (1.0%)

Go (go)

backend: 2,356 (53.1%)
cli: 1,197 (27.0%)
library: 992 (22.3%)
http: 337 (7.6%)
kubernetes: 226 (5.1%)
testing: 137 (3.1%)
docker: 98 (2.2%)
aws: 80 (1.8%)
framework: 71 (1.6%)
grpc: 70 (1.6%)
aws-sdk: 56 (1.3%)
git: 45 (1.0%)
dns: 45 (1.0%)
database: 44 (1.0%)
aws-lambda: 38 (0.9%)
security: 37 (0.8%)
terraform: 36 (0.8%)
blockchain: 36 (0.8%)
prometheus: 34 (0.8%)
templ: 34 (0.8%)

Java (java)

backend: 1,301 (49.9%)
library: 1,167 (44.8%)
aem: 147 (5.6%)
testing: 132 (5.1%)
spring: 104 (4.0%)
framework: 77 (3.0%)
http: 74 (2.8%)
android: 72 (2.8%)
json: 63 (2.4%)
sling: 57 (2.2%)
flink: 47 (1.8%)
cli: 40 (1.5%)
concurrency: 40 (1.5%)
grpc: 37 (1.4%)
nacos: 34 (1.3%)
mybatis: 33 (1.3%)
maven: 29 (1.1%)
parsing: 27 (1.0%)
sql-parser: 27 (1.0%)
websocket: 27 (1.0%)

JavaScript (js)

library: 1,797 (49.0%)
backend: 762 (20.8%)
frontend: 486 (13.3%)
cli: 435 (11.9%)
testing: 204 (5.6%)
react: 180 (4.9%)
eslint: 178 (4.9%)
framework: 176 (4.8%)
fastify: 118 (3.2%)
http: 105 (2.9%)
mongoose: 97 (2.6%)
webpack: 71 (1.9%)
typescript: 69 (1.9%)
express: 67 (1.8%)
svelte: 61 (1.7%)
lighthouse: 61 (1.7%)
aframe: 53 (1.4%)
async: 50 (1.4%)
apostrophecms: 48 (1.3%)
nodejs: 47 (1.3%)

Python (py)

backend: 1,095 (43.9%)
library: 916 (36.7%)
cli: 418 (16.7%)
ansible: 98 (3.9%)
fastapi: 78 (3.1%)
testing: 77 (3.1%)
framework: 66 (2.6%)
aiohttp: 62 (2.5%)
aws: 56 (2.2%)
django: 54 (2.2%)
http: 52 (2.1%)
click: 46 (1.8%)
dbt: 41 (1.6%)
black: 40 (1.6%)
async: 39 (1.6%)
openai: 39 (1.6%)
jinja2: 35 (1.4%)
pipx: 34 (1.4%)
pytorch: 31 (1.2%)
litellm: 31 (1.2%)

Rust (rust)

library: 1,409 (56.4%)
cli: 578 (23.1%)
backend: 453 (18.1%)
testing: 166 (6.6%)
http: 92 (3.7%)
git: 88 (3.5%)
async: 77 (3.1%)
compiler: 59 (2.4%)
parsing: 55 (2.2%)
graphql: 55 (2.2%)
datafusion: 53 (2.1%)
substrate: 48 (1.9%)
actix-web: 45 (1.8%)
macros: 45 (1.8%)
sql: 44 (1.8%)
framework: 41 (1.6%)
parquet: 39 (1.6%)
blockchain: 31 (1.2%)
sqlparser: 28 (1.1%)
clap: 28 (1.1%)

TypeScript (ts)

library: 1,236 (36.5%)
frontend: 1,078 (31.9%)
react: 733 (21.7%)
backend: 679 (20.1%)
cli: 259 (7.7%)
angular: 173 (5.1%)
graphql: 133 (3.9%)
framework: 97 (2.9%)
electron: 94 (2.8%)
fullstack: 86 (2.5%)
testing: 61 (1.8%)
github-actions: 59 (1.7%)
vue: 55 (1.6%)
express: 48 (1.4%)
javascript: 45 (1.3%)
nextjs: 42 (1.2%)
eslint: 39 (1.2%)
xstate: 37 (1.1%)
zod: 36 (1.1%)
react-native: 36 (1.1%)