[ExecuTorch][WebGPU] linear_q4gsw test suite: Llama-1B shapes + 4k/8k sweep #20227
… sweep

Pull Request resolved: #20227

Adds the numerical test suite for `et_vk.linear_q4gsw` (stacked on the op diff), mirroring the SDPA test suite. A named CONFIGS sweep covers real Llama-3.2-1B linear shapes: q/o-proj (2048->2048), k/v-proj (2048->512), gate/up-proj (2048->8192), down-proj (8192->2048), and lm_head (2048->128256), plus 4k/8k large-token prefill (M=4096/8192 on the 2048->2048 and 2048->512 projections).

`test/ops/quantized_linear/test_quantized_linear.py` exports each config's `.pte` plus an fp64 dequant-matmul "truth" golden; `test/test_webgpu_native.cpp` reconstructs the deterministic ramp input bit-for-bit, runs the op on the GPU, and compares per element; `scripts/test_webgpu_native_ci.sh` wires the fixtures into the Dawn(Tint)+SwiftShader CI.

ghstack-source-id: 392908895
@exported-using-ghexport

Differential Revision: [D108314849](https://our.internmc.facebook.com/intern/diff/D108314849/)
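As a rough illustration of the named sweep described above, the shapes could be laid out as a small table of configs. This is a sketch only: `LinearConfig`, `PROJECTIONS`, `build_configs`, the config names, and the default `M=1` for the per-projection rows are all hypothetical and not taken from `test_quantized_linear.py`; only the K->N shapes and the M=4096/8192 prefill rows come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearConfig:
    name: str  # hypothetical label, used only to identify a fixture
    M: int     # number of tokens (rows of the activation)
    K: int     # in_features
    N: int     # out_features


# K -> N shapes quoted in the PR description for Llama-3.2-1B.
PROJECTIONS = {
    "q_o_proj": (2048, 2048),
    "k_v_proj": (2048, 512),
    "gate_up_proj": (2048, 8192),
    "down_proj": (8192, 2048),
    "lm_head": (2048, 128256),
}


def build_configs(decode_m=1):
    """Build the sweep. decode_m=1 is an assumption; the description does not
    state which M the per-projection rows use."""
    configs = [
        LinearConfig(name, decode_m, k, n) for name, (k, n) in PROJECTIONS.items()
    ]
    # Large-token prefill rows: M=4096/8192 on the 2048->2048 and 2048->512 shapes.
    for m in (4096, 8192):
        for name in ("q_o_proj", "k_v_proj"):
            k, n = PROJECTIONS[name]
            configs.append(LinearConfig(f"{name}_prefill_{m}", m, k, n))
    return configs


CONFIGS = build_configs()
```

Under these assumptions the sweep has nine entries: five projection shapes and four prefill rows.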
SS-JIA left a comment
Review automatically exported from Phabricator review in Meta.
7b5567a into gh/JulianCloudNTH/24/base
Stack from ghstack (oldest at bottom):
Adds the numerical test suite for `et_vk.linear_q4gsw` (stacked on the op diff), mirroring the SDPA test suite. A named CONFIGS sweep covers real Llama-3.2-1B linear shapes: q/o-proj (2048->2048), k/v-proj (2048->512), gate/up-proj (2048->8192), down-proj (8192->2048), and lm_head (2048->128256), plus 4k/8k large-token prefill (M=4096/8192 on the 2048->2048 and 2048->512 projections). `test/ops/quantized_linear/test_quantized_linear.py` exports each config's `.pte` plus an fp64 dequant-matmul "truth" golden; `test/test_webgpu_native.cpp` reconstructs the deterministic ramp input bit-for-bit, runs the op on the GPU, and compares per element; `scripts/test_webgpu_native_ci.sh` wires the fixtures into the Dawn(Tint)+SwiftShader CI.
@exported-using-ghexport
Differential Revision: D108314849
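For readers unfamiliar with the golden-reference approach described above, here is a minimal pure-Python sketch of an fp64 dequant-matmul golden over a deterministic ramp input. It is not the code from this PR. The ramp formula, the `[-8, 7]` int4 range, the scale layout (one scale per output row per group along K), and every function name here are assumptions made for illustration; the description also does not state the comparison tolerance, so none is chosen here.

```python
def ramp_input(m, k, period=17):
    """Deterministic M x K ramp computed from indices alone, so a second
    implementation can rebuild it without a data file. Pattern is hypothetical."""
    return [
        [((i * k + j) % period - period // 2) / period for j in range(k)]
        for i in range(m)
    ]


def dequantize(q, scales, group_size):
    """Symmetric group-wise dequant: w[n][k] = q[n][k] * scales[n][k // group_size].
    q holds ints assumed to lie in [-8, 7]; scales has K // group_size entries per row."""
    return [
        [q_row[k] * s_row[k // group_size] for k in range(len(q_row))]
        for q_row, s_row in zip(q, scales)
    ]


def golden_linear(x, q, scales, group_size):
    """Reference y = x @ W^T, accumulated in Python floats (IEEE fp64)."""
    w = dequantize(q, scales, group_size)
    return [
        [sum(x_row[k] * w_row[k] for k in range(len(x_row))) for w_row in w]
        for x_row in x
    ]


def max_abs_err(actual, expected):
    """Largest per-element absolute difference between two same-shaped matrices."""
    return max(
        abs(a - e)
        for row_a, row_e in zip(actual, expected)
        for a, e in zip(row_a, row_e)
    )


# Tiny usage example: N=2 output rows, K=4 inputs, group_size=2.
q = [[1, -2, 3, -4], [7, -8, 0, 5]]
scales = [[0.5, 0.25], [0.125, 1.0]]
y = golden_linear(ramp_input(2, 4), q, scales, group_size=2)
```

A caller would compare the GPU output against `y` with `max_abs_err` and whatever threshold the real suite uses.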