Mesa 23.1.0 Release Notes / 2023-05-10
Mesa 23.1.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 23.1.1.
Mesa 23.1.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
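Because the reported version varies by driver and context type, an application may want to check it at run time rather than assume 4.6. The following is a minimal sketch of parsing a GL_VERSION string that has already been obtained from a current context (context creation is not shown). The helper names `parse_gl_version` and `meets_requirement` are hypothetical, and the sample strings are illustrative only.

```python
import re

# Optional "OpenGL ES" / "OpenGL ES-CM" prefix, then "major.minor".
_GL_VERSION_RE = re.compile(r"(?:OpenGL ES(?:-[A-Z]+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a GL_VERSION string."""
    match = _GL_VERSION_RE.match(version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_requirement(version_string, required=(4, 6)):
    """True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required


# Illustrative strings only; the actual text depends on the driver and context.
print(parse_gl_version("4.6 (Core Profile) Mesa 23.1.0"))            # (4, 6)
print(meets_requirement("3.3 (Compatibility Profile) Mesa 23.1.0"))  # False
```

Where the context supports them, querying GL_MAJOR_VERSION and GL_MINOR_VERSION with glGetIntegerv avoids string parsing altogether.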
Mesa 23.1.0 implements the Vulkan 1.3 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
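The apiVersion value is a packed 32-bit integer, so it has to be decoded before it can be compared against 1.3. The sketch below mirrors the bit layout used by the VK_API_VERSION_* macros in the Vulkan headers. The function names are hypothetical, and the raw value would come from vkGetPhysicalDeviceProperties, which is not shown here.

```python
def decode_vk_api_version(api_version):
    """Split a packed Vulkan version into (variant, major, minor, patch)."""
    variant = (api_version >> 29) & 0x7
    major = (api_version >> 22) & 0x7F
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return variant, major, minor, patch


def make_vk_api_version(variant, major, minor, patch):
    """Pack the four fields back into a single 32-bit value."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


# Vulkan 1.3.0 packs to 4206592; a driver may report a different patch level
# or a lower minor version.
packed = make_vk_api_version(0, 1, 3, 0)
print(packed, decode_vk_api_version(packed))  # 4206592 (0, 1, 3, 0)
```

Comparing the decoded `(major, minor)` pair against `(1, 3)` tells an application whether the selected driver actually exposes Vulkan 1.3.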
SHA256 checksum
a9dde3c76571c4806245a05bda1cceee347c3267127e9e549e4f4e225d92e992 mesa-23.1.0.tar.xz
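One way to check a downloaded tarball against the checksum above is sketched here. It assumes the file sits in the current directory under the name shown; the helper names are hypothetical. The file is read in chunks so the whole archive is never held in memory.

```python
import hashlib
import os

# Checksum and file name as published in these release notes.
EXPECTED_SHA256 = "a9dde3c76571c4806245a05bda1cceee347c3267127e9e549e4f4e225d92e992"
TARBALL = "mesa-23.1.0.tar.xz"  # assumed to be in the current directory


def sha256_of_chunks(chunks):
    """Return the hex SHA-256 digest of an iterable of bytes objects."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path, chunk_size=1 << 20):
    """Yield the file at `path` piece by piece (1 MiB by default)."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def tarball_matches(path, expected=EXPECTED_SHA256):
    """True if the file's SHA-256 equals `expected` (case-insensitive hex)."""
    return sha256_of_chunks(read_chunks(path)) == expected.strip().lower()


# Only runs the check when the tarball is actually present.
if __name__ == "__main__" and os.path.exists(TARBALL):
    print("OK" if tarball_matches(TARBALL) else "MISMATCH")
```

On most systems the `sha256sum` command-line tool gives the same digest; the script is useful mainly when the check has to be embedded in a build or packaging step.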
New features
VK_EXT_pipeline_library_group_handles on RADV
VK_EXT_image_sliced_view_of_3d on RADV/GFX10+
VK_KHR_map_memory2 on ANV and RADV
fullyCoveredFragmentShaderInputVariable on RADV/GFX9+
VK_EXT_discard_rectangles version 2 on RADV
VK_EXT_graphics_pipeline_library on RADV
extendedDynamicState3ColorBlendEquation on RADV
primitiveUnderestimation on RADV/GFX9+
VK_KHR_fragment_shading_rate on RADV/GFX11
VK_EXT_mesh_shader on RADV/GFX11
RGP support on RADV/GFX11
GL_NV_alpha_to_coverage_dither_control on r600/evergreen+
Bug fixes
[radeonsi] flickering debug chunk border lines in Minecraft
radv, radeonsi: Rogue Legacy 2 alpha-to-coverage rendering issues
[r600, TURKS] R600: Unsupported instruction: vec1 32 ssa_1 = intrinsic image_samples (ssa_0) on spec@arb_shader_texture_image_samples@compiler@fs-image-samples.frag (23.1.0-rc4)
vulkan/device_select: no way to select between GPUs of the same model due to bugs
Intel/anv: Modifier problems running gamescope embedded
radv: 7900 XTX hair flickering/rendering issues in VaM
radv: cache crashing
nouveau: Regression in arb_transform_feedback_overflow_query-basic from multithreading patches
radeonsi: vaapi: `width >2880 && width % 64 != 0` results in wrong width in h265 stream
[regression] iris: unable to use driver as secondary GPU (primary AMD GPU)
iris: steam doesn’t render on dg2
[llvm 16+] [microsoft-clc] opencl-c-base.h does not exist
Vulkancts clipping / tesselation tests trigger gpu hang on DG2
Swaped fields in picture in vlc and mythtv if hw accel is on
WGL: Assert assigns dwThreadId variable
nine regression with r600 (bisected)
[ACO] [RADV] Flickering squares in some areas in The Last of Us Part 1 (with workaround)
radv: Jedi Fallen Order flickering & blocky plants
nouveau: NV50 (NVAC) broken in latest master
rusticl failed to build with rust-bindgen 0.65.0
Regression, Bisected: glsl: Delete the lower_tess_level pass breaks r600 tesselation
vkcts-navi21-valve failing often with GCVM_L2_PROTECTION_FAULT_STATUS:0x00X00830
Deep Rock Galactic GPU freeze (AMD, DX11 DXVK Proton)
radv: Resident Evil 4 Chainsaw Demo GPU hang with Navi 24
radv: Gotham Knights GPU hang with Navi 24
aco: s_load_dword with negative soffset cause GPU hang
piglit.spec.ext_image_dma_buf_import.ext_image_dma_buf_import crash shutting down
overlay layer: unable to launch titles on steam
radv/zink: spec@ext_texture_integer@multisample-fast-clear gl_ext_texture_integer
VAAPI: Wrong H.264 playback on RX 6900 XT and RX 6700 XT (all Sienna?)
radv: possibly not setting state dirty bits correctly
RADV: VRS attachment not working in specific scenario
rusticl: invalid SPIR-V kernel causes panic
[RADV] The Last Of Us Part 1: artifacting in the menu (with workaround)
AMD va-api outputs corrupt encoding
!20673 regressed `dEQP-VK.wsi.xlib.surface.query_formats`
aco: missing dependency on generated header
zink: spirv validation errors with spirv 1.6
freedreno/a6xx: Assertion `view->rsc_seqno == rsc->seqno’ failed.
iris regression in map stride after import with gen9 parts
anv: zink ADL failures
Vulkan loader `vk_common_GetPhysicalDeviceFormatProperties` fails to sanitize properties bits.
Loading a model in PrusaSlicer 2.6.0-alpha5 crashes GNOME on radeonsi
[glx][bisected][regression]Intel HD 3000 failing to create context on applications like Unity
v3d: dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rg32f_cube and similar fails when run together with other tests
standalone glsl compiler not getting built with mingw
!22191 broke test-docs-mr
mesa: index buffer leaking
RadeonSI: null dereference in amdgpu_cs_add_buffer, potential refcount mismatch, running BeyondAllReason
NIR can’t unroll any loop from nine
Steel Division 2 - radv/gpu hang - bisected
Turnip: VKD3D can’t run due to lack of memory property flag
eglCreateImageKHR, error: EGL_BAD_ALLOC (0x3003), message: “createImageFromDmaBufs failed” on AMD multi-gpu with explicit format modifiers
radv: In the game Quake II RTX appeared artifacts at fresh mesa builds
radv: Vampire: The Masquerade - Bloodline (Unofficial Patch) regression
radeonsi broken for gcn1 card
libgrl.a installed but not used?
radv: crash compiling UE5 lumen hardware RT shader
spec@ext_transform_feedback@builtin-varyings gl_culldistance fail
Panfrost T860 - broken system with latest mesa on gnome wayland jammy
aco: unused vtmp_in_loop
FTBFS: src/amd/llvm/ac_llvm_util.c:248:4: error: implicit declaration of function ‘LLVMAddIPSCCPPass’ (LLVM C interface removed upstream)
vulkan: new generated physical_device_feature missing meson dependency
Build broken on old-ish Python versions
radv: Support fullyCoveredFragmentShaderInputVariable from VK_EXT_conservative_rasterization on RDNA2+
radv,nir: dEQP-VK.ray_query.builtin.rayqueryterminate.* failures
RFE: Use _mesa_is_foo(ctx) helpers more
ci: infinite XDG_RUNTIME_DIR spam
ci: XDG_RUNTIME_DIR spam
[KBL] iris failures with dEQP-GLES3.functional.texture.compressed.astc.void_extent*
glsl compiled error when the RHS of operator `>>` is int64_t by enabling GL_ARB_gpu_shader_int64 extension
turnip: inline uniforms regression
QPainter fails to render multiple shapes with a brush set since Mesa 23.0
eglSwapBuffers blocks in wayland when it’s wl_surface_frame event is stolen.
plasmashell sometimes hangs with mesa_glthread
pps_device.h:23:11: error: ‘uint32_t’ does not name a type
Build fails with llvm 17: llvm/ADT/Triple.h: No such file or directory
nir: i2f32(i2i32(x@8)) isn’t being collapsed to i2f32(x)
zink-lvp no longer running tests
radv: Immortals Fenyx Rising: Grass Flicker on R9 380X and Steam Deck
radv: A Plague Tale: Requiem black “flash” on 7900XTX
7900 XTX: Graphical corruption / artifacts in Cyberpunk
radeonsi draws spurious values to depth buffer
Commit ccaaf8fe04c956d9f16f98b7f7fa69a2526283bc causes GPU ring timeouts on BONAIRE
radv: CmdCopyQueryPoolResults broken for VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT with queryCount > 1
rusticl over llvmpipe + ffmpeg’s Opencl filter = error -51
ci: Remove LAVA Gitlab section handler workaround
System freeze when playing some h264 videos with VA-API on Rembrandt
OpenGL crashes in X-Plane 11
r600/TURKS: NIR Shader related errors on CLI with the game “A Hat in Time” and Gallium Nine
agx: shifts aren’t sound
ci: build logs hidden
fatal error: intel/dev/intel_wa.h: No such file or directory
[Bisected] Regression: Project Zomboid renders black
hasvk: Black pixels with 8xMSAA and fast clears on Intel(R) HD Graphics 4400 (HSW GT2)
radv: GTA IV graphical artifacts on 7900XTX
radv: Resident Evil Revelations 2 artifacts on 7900XTX with DCC
radv: Prototype 2 black textures on RDNA 3 when DCC is enabled
Mesa 23.0.0 crashes immediately with indirect rendering
virpipe-on-gl: arb_enhanced_layouts@matching_fp64_types crashes
[RADV] Returnal - pistol muzzle flash fills whole screen (graphical artifact)
ACO: dEQP-VK.binding_model.descriptor_buffer.multiple.graphics_geom_buffers1_sets3_imm_samplers hangs on NAVI10
Build failures with recent lld
r600,regression: Glitches on terrain with the NIR backend on Transport Fever 2
[radeonsi] Regression with MSAA fix for Unreal / Unreal Tournament 99
spirv: Switch Vulkan drivers to use `deref_buffer_array_length`
r600/TURKS: Crash of the game “A Hat in Time” with Gallium Nine and NIR path (third report)
[gen9atom] Vulkan tests cause gpu hang: dEQP-VK.memory_model.*
GL_SHADER_BINARY_FORMAT_SPIR_V is not added to the list of GL_SHADER_BINARY_FORMATS even if GL_ARB_gl_spirv is supported.
mesa: “Fragmented” dynamic lights in IronWail with `r_fsaamode 1` on
[ANV/DG2] Vertex explosion in nvpro-samples/vk_raytracing_tutorial_KHR/ray_tracing_gltf
CUEtools FLACCL hit assert in rusticl
Assertion Failed on Intel HD 5500 with Linux / Mesa 22.3.1 / OpenGL
Rise of the Tomb Raider’s Ambient Occlusion pass misrenders (swimming shadows)
vk_enum_to_str: missing VkPipelineCreateFlags
[glsl] [spirv] ssbo unsizied array not supported ?
Creating a vulkan physical device on an AMD GPU causes following calls to drmModeAddFB to fail with ENOENT
Minecraft: spec related compile errors
mesa: _mesa_glthread_upload crash
glthread: OpenGL submission blocks while swapping buffers
glthread: Loading a shader cache in yuzu slows down with mesa_glthread=true
Commit “”radeonsi: enable glthread by default”” (d6fabe49cd72fb) causes a regression in gstreamer gtkglsink element
llvmpipe: linear rasterizer / depth bug
radv: (Using mesh shader) NIR validation failed after nir_lower_io_to_scalar_early
panfrost Mali-G31 glamor regression
allwinner a64: DRM_IOCTL_MODE_CREATE_DUMB failed: Cannot allocate memory after some time of apps usage
turnip: dEQP-VK.ubo.random.all_shared_buffer.48 slow
wine + dxvk + Rise of the Tomb Raider crashes in Soviet Installation 20% with VK_ERROR_DEVICE_LOST
Sometimes VLC player process gets stuck in memory after closure if video output used is Auto or OpenGL
kwin_wayland crashes involving dri2_create_drawable when Plasma starts and the llvmpipe driver from Mesa 23.0-rc3 and 23.0-rc4 is used
turnip: no ubwc fast clear for depth on a618
anv: VK_ACCESS_2_SHADER_READ_BIT doesn’t seem to be handled correctly
Vulkan WSI flags leak into NIR, breaking build on BSDs
Iris corruptions in zoom calls
Sampling with aux enabled with ISL_AUX_STATE_PASS_THROUGH seems broken on Tigerlake+
anv: incorrect task shader payload
radv: Hi-Fi Rush incorrectly rendering face shadows with DCC on 7900 XTX
[iris] isl_calc_min_row_pitch seems incorrect on a750
DG2: incorrect rendering in Sascha Willems raytracing callable demo
turnip: conditional load/store hurts some workloads
Some blackouts / rendering issues with RADV_PERFTEST=gpl in Battlefield 1 (DX11)
radv/zink: ACO assert with DOOM2016
Registered special XGE not unregistered
draw_llvm.c:788:7: error: implicit declaration of function ‘LLVMContextSetOpaquePointers’
asahi: Optimize lower_resinfo for cube maps
Metro Exodus hits nir validation with a driver supporting raytracing.
ANV Gen 9.5 swapchain corruption when using newer `VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL` layout
turnip: optimal bin layout
piglit.spec.arb_shader_texture_image_samples.compiler.fs-image-samples_frag regression
lavapipe assert fails on Windows
zink: itoral-gl-terrain-demo rendering failure
asahi: Implement a shader disk cache
[ICL] Trine 4 trace causing GPU HANG
radv: Segfault during createAccelerationStructure when the backing buffer is not bound to memory
7900 XTX stuck at ‘compiling shaders’ in Monster Hunter Rise
radv: slow GPL (fast) link times
libEGL warning: failed to get driver name for fd -1
iris: Context incorrectly marked as guilty
simple_mtx.h:34:12: fatal error: valgrind.h: No such file or directory
[ANV] Commit 4ceaed78 causes misrendering on Cyberpunk 2077
adding eglGetMscRateANGLE support for multiple monitors with different refresh rates
Performance regression in Chromium WebGL when implement ANGLE_sync_control_rate with egl/x11
ci: Ensure that the Intel/Freedreno trace pipelines only show up in relevant MR’s
anv: Performance issue with Vulkan on Wayland KWin
Incorrect format conversion on big endian
radv: State of Decay 2 character rendering regression
aco_tests assembler.gfx11.vop12c_v128/gfx11 failure
r600,regression: Loading of DOOM stuck at 0% with the NIR backend
RADV: enabling TC-compat HTILE in GENERAL for compute queues is likely broken
Confidential issue #8065
VAAPI HEVC encode broken since 22.3
GPU HANG: ecode 12:1:859ffffb (Resetting rcs0 for stopped heartbeat on rcs0) - reproducible
zink: src/gallium/auxiliary/pipebuffer/pb_slab.c:138: Assertion failed: `heap < slabs->num_heaps`
[zink] Assertion `heap < slabs->num_heaps’ failed on Pascal (bisected)
[RADV] Incorrect copies to/from compressed textures with mipmaps
mesa_glthread=true and probably ANY id Tech 3 engine games, offroad…
radeonsi: VRAM Leak/abnormally high usage in Minecraft mod pack
nir/lower_blend: Bogus assert
anv-tgl-vk: fails a multiple jobs after changing sharding
radv CTS crashes since ebec42d799b22b7b3d06acd710f5687252446a06
llvmpipe: dEQP-EGL programs.link failures.
libmesa_util depends on gallium
EGL report EGL_EXT_create_context_robustness with kms_dri drvier while can’t create context with EGL_LOSE_CONTEXT_ON_RESET_EXT attribute.
v3d: missing drm format modifier support on Raspberry Pi 4 required for mpv
Return To Monkey Island black screen
navi22 amdgpu: bo 000000002843d677 va 0x0800000400-0x08000005ff conflict with 0x0800000400-0x0800000600
Ryzen 6800H laptop amdgpu: bo 00000000b1eb583a va 0x0800000200-0x08000003ff conflict with 0x0800000200-0x0800000400
[RADV] [MISSED PERFORMANCE POTENTIAL] Vulkan not working when Color Depth is set to “16”, but Vulkan works when Color Depth is set to “24”
v3dv: f2f16_rtz lowering could be improved
debug build compilation failed: inlining failed in call to ‘always_inline’ ‘src_is_ssa’: indirect function call with a yet undetermined callee
radv: regression: broken UI rendering in Elden Ring
radv: Missing implementation of VkImageSwapchainCreateInfoKHR and VkBindImageMemorySwapchainInfoKHR
Changes
Adam Jackson (22):
glx/dri3: Simplify protocol version tracking
glx: Remove glx_context::screen
glx: Remove a can’t-happen NULL check
glx: Remove support for glXGetDriverConfig for old drivers
glx: Clean up some funny business from context bind/unbind
glx: Reflow MakeContextCurrent a little
glx: Check for initial “glX” first in glXGetProcAddress
glx: Move 1.2 GLXPixmap code into glx_pbuffer.c
glx: Inline a few single-use constant strings into their user
glx: Fix drawable type inference in visual/fbconfig setup
glx: Harmonize glXCreateGLXPixmap with glXCreatePixmap
mesa: Fix extension table formatting
mesa: Trivially advertise NV_generate_mipmap_sRGB
wsi/x11: Make get_sorted_vk_formats handle varying channel widths
wsi/x11: Infer the default surface format from the root window’s visual
wsi/x11: Support depth 16 visuals
glx/dri: Use X/GLX error codes for our create_context_attribs
dri: Validate more of the context version in validate_context_version
glx/dri: Fix error generation for invalid GLX_RENDER_TYPE
glx: Disable the indirect fallback in CreateContextAttribs
glx: Fix error handling yet again in CreateContextAttribs
mesa: Enable NV_texture_barrier in GLES2+
Adam Stylinski (2):
glx: fix a macro being invoked with the wrong parameter name
mesa: fix out of bounds stack access on big endian
Alan Coopersmith (1):
util/disk_cache: Handle OS’es without d_type in struct dirent
Alejandro Piñeiro (17):
vulkan/wsi: check if image info was already freed
v3dv/format: remove unused v3dv_get_tex_return_size
v3dv/pipeline: rename lower_tex_src_to_offset to lower_tex_src
v3dv: pass alignment to v3dv_buffer_init
v3dv/image: use 64-byte alingment for linear images if needed
v3dv: skip two ycbcr tests
broadcom/compiler: v3d_nir_lower_txf_ms doesn’t need v3d_compile
broadcom/compiler: treat PIPE_FORMAT_NONE as 32-bit formats for output type
v3dv: enable shaderStorageImageReadWithoutFormat
broadcom/compiler: fix indentation at v3d_nir_lower_image_load_store
nir: track if var copies lowering was called
radv: use shader_info->var_copies_lowered
anv: use shader_info->var_copies_lowered
v3d/v3dv: use shader_info->var_copies_lowered
v3dv: handle ASPECT_MEMORY_PLANE aspect flags when getting plane number
v3dv/debug: add debug option to disable TFU codepaths
v3dv/pipeline: use pipeline depth bias enabled to fill up CFG packet
Alexandros Frantzis (2):
egl/wayland: Fix destruction of event queue with proxies still attached.
vulkan/wsi/wayland: Fix destruction of event queue with proxies still attached.
Alyssa Rosenzweig (351):
nir/peephole_select: Allow load_preamble
agx: Peephole select after opt_preamble
asahi: Handle sampler->compare_mode
panfrost: Don’t use AFBC of sRGB luminance-alpha
pan/bi: Fix incorrect compilation of fsat(reg.yx)
pan/bi: Add a unit test for fsat(reg.yx)
panfrost: Enable NV_primitive_restart on Valhall
panfrost: Fix logic ops on Bifrost
panfrost: Stop testing CAP_INT16
panfrost: Remove PAN_MESA_DEBUG=deqp
panfrost: Remove unused debug parameter
panfrost: Fix clears with conditional rendering
panfrost: Document render_condition_check contract
nir: Add Midgard-specific fsin/fcos ops
nir: Optimize vendored sin/cos the same way
pan/mdg: Use special NIR ops for trig scaling
pan/mdg: Scalarize LUT instructions in NIR
pan/mdg: Remove MSGS debug
mesa: Set info.separate_shader for ARB programs
nir/lower_blend: Fix alpha=1 for RGBX format
nir/lower_blend: Clamp blend factors
nir/lower_blend: Fix SNORM logic ops
nir/lower_blend: Avoid useless iand with logic ops
nir/lower_blend: Don’t do logic ops on pure float
nir/lower_blend: Handle undefs in stores
nir/lower_blend: No-op nir_color_mask if no mask
asahi: Omit extra call to clock_gettime
nir/opt_preamble: Treat *size as an input
nir/opt_preamble: Consider load_preamble as movable
agx: Lower system values in NIR in the driver
agx: Bump preamble_storage_size to 512
agx: Centralize texture lowering
asahi: Use non-UAPI specific BO create flags
nir: Add a late texcoord replacement pass
asahi: Run nir_lower_fragcolor during preprocessing
asahi: Lower texcoords late
panfrost: Implement GL_EXT_render_snorm on Bifrost+
ail: Add layout->mipmapped_z input
ail: Test mipmapped_z behaviour
ail: Test 63x63 cube map
asahi: Set layout->mipmapped_z for 3D textures
asahi: Fix encoding of uniform size
asahi: Strengthen agx_usc_uniform contract
asahi/nir_lower_sysvals: Split large ranges
asahi: Correct alignment for USC Uniform packets
agx: Support uniform registers as LODs
asahi: Use writeback when it looks beneficial
asahi: Make STAGING resources linear
asahi: Prefer blit-based texture transfer
asahi: Implement nontrivial rasterizer discard
asahi: DRY dirty tracking conditions
asahi: Remove redundant tri merge disable bit
asahi: Merge fragment control XML
agx: Keep varyings forwarded to texture as fp32
asahi: Don’t use 16-bit inputs to 32-bit st_tile
docs/asahi: Document clip distance varyings
agx: Fix storing to varying arrays
agx: Handle constant-offset in address matching
asahi: Add XML for custom border colours
agx/decode: Add a data parameter to stateful
agx/decode: Handle extended samplers
asahi: Implement custom border colours
asahi: Fix delete_vs_state implementation
asahi: Add compute kernel scaffolding
asahi: Don’t leak shader NIR
asahi: Add hooks for SSBO and images
asahi: Fake more caps for dEQP-GLES31
asahi: Advertise seamless cube maps
asahi: Stub out MSAA for dEQP
asahi: Bump PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS
asahi: Add compute batches
asahi: Implement load_ssbo_address/get_ssbo_size
asahi: Identify more compute-related XML
agx: Implement compute ID intrinsics
agx: Implement barriers
nir/print: Extract get_location_str
nir/print: Pretty-print I/O semantic locations
nir/print: Pretty-print color0/1_interp
agx: Allow uniform sources on phis
agx: Run DCE twice
agx: Lower uniform sources with a dedicated pass
agx: Don’t scalarize preambles in NIR
nir/lower_clip: Only emit 1 discard
tu,vulkan: Add common Get*OpaqueCaptureDescriptorDataEXT
radv: Use common Get*OpaqueCaptureDescriptorDataEXT
agx: Remove unused AGX_MAX_VARYINGS
agx: Respect component in frag load_input
agx: Fix AGX_MAX_CF_BINDINGS
agx: Remove bogus gl_Position assertion
agx: Implement load_helper_invocation
agx: Write sample mask even with no colour output
asahi: Submit batches that don’t touch RTs
asahi: Add XML for indirect dispatch
asahi: Add XML for indirect draws
asahi: Add XML for VDM memory barriers
panvk: Take lock when tracing
panvk: Fix varying linking
panvk: Disable SNORM rendering
asahi: Remove default=true on index list values
asahi: Refactor index buffer upload for indirect
asahi: Implement indirect draws
panfrost: Fix some fields in v10.xml
pan/decode: Add support for decoding CSF
asahi: Vectorize background colour load
panfrost: Disable CRC by default
panfrost: Fix prim restart XML on Valhall
nir: Augment raw_output_pan with IO_SEMANTICS+BASE
pan/lower_framebuffer: Operate on lowered I/O
nir/lower_blend: Don’t touch store->dest
nir/lower_blend: Don’t handle gl_FragColor
nir/lower_blend,agx,panfrost: Use lowered I/O
asahi: Lower clip distances late
asahi: Move agx_preprocess_nir to CSO create
agx: Don’t treat clip distances specially
agx: Do more work in agx_preprocess_nir
asahi: Fix rendering into mipmapped framebuffers
agx: Lower offsets in NIR
agx: Model and pack gathers
agx: Implement gathers (nir_texop_tg4)
docs/features: Sync Asahi with reality
asahi: Advertise ARB_derivative_control
asahi: Advertise ARB_texture_barrier
agx: Model atomic instructions
agx: Model local loads/stores
agx: Disallow immediate bases to device_load
agx: Pack global atomics
agx: Pack local load/store instructions
agx: Translate NIR atomics
agx: Translate load/store_shared
agx: Lower shared memory offsets to 16-bit
agx: Pack local atomics
agx: Implement b2b32
agx: Handle group_memory_barrier
agx: Add and use agx_nir_ssa_index helper
agx: Handle ssa_undef as zero
agx: Add agx_internal_format_supports_mask helper
asahi: Implement color masks with masked stores
asahi: Make shader-db work again
panfrost: Use proper locations in blend shaders
nir/lower_blend: Consume dual stores
nir: Add nir_texop_lod_bias_agx
asahi: Lower lod_bias_agx to uniform registers
agx: Lower sampler LOD bias
nir/lower_blend: Don’t dereference null
docs/feature: Mark ARB_sync as done on Asahi
asahi/decode: Handle VDM barriers
nir: Add nir_lower_helper_writes pass
pan/mdg: Use nir_lower_helper_writes
asahi: Advertise dual-source blending
agx: Mask shifts in the backend
agx: Fix 2D MSAA array texture register allocation
asahi: Mark PIPE_FORMAT_NONE “supported”
agx: Don’t write sample mask from preambles
agx: Add AGX_MESA_DEBUG=nopreamble option
agx: Clean up after lowering address arithmetic
agx: Factor out allows_16bit_immediate check
agx: Inline 16-bit load/store offsets
agx: Constify agx_print
agx: Refactor vector creation
agx: Use agx_emit_collect for st_tile
agx: Don’t print pre-optimization shader
agx: Only lower int64 late
asahi: Bump shader buffers
asahi/meta: Use lowered I/O
agx: Disable tri merging with side effects
agx: Handle fragment shader side effects
asahi: Rework system value lowering
asahi: Wire up compute kernels
nir/lower_tex: Add lower_index_to_offset
pan/bi: Use lower_index_to_offset
ir3: Use lower_index_to_offset
nir/opt_barrier: Generalize to control barriers
glsl/nir: Use scoped_barrier for control barrier
pan/bi: Drop control_barrier handling
pan/mdg: Drop control_barrier handling
ir3: Drop non-scoped barrier handling
gallivm: Drop non-scoped barrier handling
agx/lower_address: Break on match
agx/lower_address: Optimize “shift + constant”
agx/lower_address: Handle large shifts
agx/lower_address: Handle 8-bit load/store
agx/lower_address: Fix handling of 64-bit immediates
agx/lower_address: Handle 16-bit offsets
agx: Assert that memory index is 32-bit reg
agx: Fix clang-formatting
agx: Pack indirect texture/sampler handles
agx: Handle indirect texture/samplers
asahi: Don’t allow linear depth/stencil buffers
asahi, agx: Implement dummy samplers
asahi,agx: Implement buffer textures with gnarly NIR
panfrost: Remove some unused definitions
docs/panfrost: Move description of instancing
panfrost: Don’t use DECODE_FIXED16 for sample position
panfrost: Handle fixed-point packing in GenXML
panfrost: Add XML for framebuffer pointers
panfrost: Use framebuffer pointer XML
panfrost: Remove FBD tag enum from XML
panfrost: Inline the last MALI_POSITIVE use
panfrost: Remove MALI_POSITIVE macro
pan/mdg: Remove reference to removed macro
agx: Don’t set lower_pack_split
agx: Make partial DCE optional
agx: Fix subdivision coalescing
agx: Implement extract_[ui]16
agx: Use nir_lower_mem_access_bit_sizes
agx: Switch to scoped_barrier
nir/lower_point_size: Use shader_instructions_pass
ail: Restructure generated tests
agx: Lower discard late
util/prim_convert: Don’t set index_bounds_valid
pan/bi: Ignore signedness in vertex fetch
panfrost: Identify “Base vertex offset” signedness
panfrost: Assert that we don’t see unsupported vertex formats
panfrost: Defeature 24-bit textures
panfrost: Handle null textures robustly
panfrost/ci: Skip draw_buffers_indexed.random.* on Midgard
panfrost/ci: Identify some Piglit flakes
panfrost/ci: Add some Piglit skips
panfrost/ci: Remove fbo-mrt-new-bind fail+flake
panfrost: Note glDrawRangeElements underflow
asahi: Fix occlusion query lifetime
panfrost: Don’t round up Midgard polygon list BOs
panvk: Use vk_get_physical_device_features
asahi: Use a dynarray for writers
ci: Add clang-format to the amd64 container
ci: Enforce clang-format for asahi
gallium: Fix u_stream_outputs_for_vertices with QUADS
nir/builder: Add nir_umod_imm helper
blorp,anv,hasvk: Use umod_imm
v3d,v3dv: Use udiv_imm/umod_imm
radv: Use umod_imm
ir3: Use umod_imm
nir: Add Panfrost intrinsics to lower sample mask
nir: Add Mali load_output taking converison
panfrost: Use 0/~0 boolean for MSAA sysval
pan/bi: Don’t duplicate texture op cases
pan/bi: Lower sample mask writes in NIR
pan/bi: Lower load_output to make sysval explicit
pan/bi: Allow specializing bifrost_nir_options by arch
pan/bi: Lower gl_VertexID in NIR
pan/bi: Remove bi_load_sysval
pan/mdg: Use I/O semantics for MRT blend stores
panfrost: Remove inputs->blend.rt
panfrost: Remove unused inputs.nr_cbufs
pan/bi: Only lower once
pan/mdg: Only lower once
pan/bi: Split out early preprocessing from late
pan/mdg: Split out early preprocessing from late
pan/lower_framebuffer: Only call for FS
pan/lower_framebuffer: Use nir_shader_instructions_pass
pan/blit: Lower load_sampler_lod_parameters_pan
panfrost: Preprocess shaders in the driver
pan/lower_framebuffer: Lower MSAA blend shaders
panfrost: Lower clip_fs late
panfrost: Lower texcoords late
panfrost: Effectively lower gl_FragColor late
panfrost: Preprocess shaders at CSO create time
panfrost: Remove stale TODO
panvk: Lower sysvals in NIR
panvk: Don’t use vec4 for vertex_instance_offsets
panvk: Inline blend constants as syvals
panfrost: Add NIR-based sysval lowering pass
panfrost: Lower sysvals in GL
panfrost: Move sysvals to GL driver struct
panvk: Remove unused function
panfrost: Move panfrost_sysvals to GL driver
pan/bi: Export bifrost_nir_lower_load_output
pan/bi: Call pan_nir_lower_zs_store late
panvk: Lower blending late
panfrost: Remove Midgard RSD fields from Bifrost
asahi: Convert to SPDX headers
mesa/st: Only set seamless for GLES3
mesa/st: Normalize wrap modes for seamless cubes
asahi: Don’t lie about seamless cube maps
panfrost: Print perf debug when flushing everything
panfrost: Print perf debug on seqnum overflow
panfrost: Don’t redundantly call emit_const_buf
panfrost: Mark packs as ALWAYS_INLINE
panfrost: Don’t update access with a single batch
panfrost: Add a v9 fast path for no images
panfrost: Clean up tiler calculations
panfrost: Estimate vertex count for hier mask
panfrost: Choose hierarchy masks by vertex count
docs: Remove docs about macOS hardware drivers
nv50,nvc0: Use u_pipe_screen_get_param_defaults
panfrost: Always upload a workaround sampler
pan/{mdg,bi}: Always use sampler 0 for txf
panfrost: Unset TEXTURE_BUFFER_SAMPLERS
gallium: Remove PIPE_CAP_TEXTURE_BUFFER_SAMPLER
docs/gallium: Note samplers are not used for txf
nir/print: Don’t print sampler_index for txf
asahi: Support more renderable formats
agx: DCE even with noopt
agx: Assert that we don’t overflow registers
agx: Constify agx_{read,write}_registers
agx: Don’t allow uniform source to local_atomic
agx: Don’t destroy usub_sat with constant
asahi: Add perf debug for generate_mipmap
asahi: Add perf debug for shader variants
agx: Set loads_varying accurately
agx: Add helper for calculating occupancy
asahi/decode: Remove agxdecode_dump_bo
asahi/decode: Print VDM barriers
asahi: Set PIPE_CAP_LOAD_CONSTBUF
agx: Coalesce more collects
agx: Don’t overallocate registers
asahi: Honour sampler count
asahi: Implement null textures
asahi: Lower 1D to 2D
asahi: Dirty track depth bias uploads
asahi: Clamp texture buffer sizes
agx: Tease apart some sample_mask packing magic
agx: Rename writeout to wait_pix
agx: Make signal_pix instructions explicit
vulkan: Add common features2_to_features
radv: Use vk_features2_to_features
v3dv: Use vk_features2_to_features
lavapipe: Use vk_features2_to_features
pvr: Use vk_features2_to_features
anv,hasvk: Use vk_features2_to_features
tu: Use vk_features2_to_features
nir: Combine if_uses with instruction uses
nir/opt_ray_queries: Don’t use list_length
nir/opt_loop_unroll: Avoid list_length
nir: Remove 2nd argument from nir_before_src
nir/validate: Don’t treat if-uses specially
dxil: Avoid list_length
nir: Reduce indirection
nir: Factor out nir_src_rewrite_ssa helper
nir: Use nir_src_rewrite_ssa
dxil: Use nir_src_rewrite_ssa
nir: Remove nir_if_rewrite_condition_ssa
nir/repair_ssa: Refactor some use handling
nir/validate: Only walk uses once
mailmap: Update my e-mail
panfrost: Symlink gallium .clang-format to common
panfrost/winsys: Add .clang-format for winsys folder
panfrost/winsys: Clang-format
pan/decode: Move comment out of designated initializer
panfrost: Re-run clang-format
panvk: Clang-format
ci: Run clang-format on panfrost
mesa/st: Set uses_sample_shading when forcing per-sample
nir/lower_blend: Set uses_fbfetch_output conservatively
nir/lower_blend: Enable per-sample shading
pan/bi: Lower swizzles for 8-bit CSEL
pan/bi: Respect swizzles for more vector ops
pan/bi: Use nir_lower_mem_access_bit_sizes
panfrost: Allocate shared memory in OpenCL
pan/decode: Print compute job payloads
asahi: Fix disk cache disable with AGX_MESA_DEBUG
Amber (15):
util/u_trace: pass utrace context to marker functions.
freedreno: add support for markers.
ir3, isaspec: add raw instruction to assembler/disassembler.
ir3: support texture and sampler index with offsets
nir: support lowering nir_intrinsic_image_samples to a constant load
ir3: use lower_image_samples_to_one
intel/compiler: use lower_image_samples_to_one
freedreno: make sure depth/stencil layouts are always tiled
freedreno: use A6XX_GRAS_SC_CNTL_SINGLE_PRIM_MODE with fb readback
gallium: make BlendCoherent usable from gallium drivers
freedreno: use blendcoherent to set FLUSH_PER_OVERLAP
freedreno: check for conditional rendering in launch_grid
nir: allow nir_lower_fb_read to support multiple render targets
nir: Add memory coherency information to shaders.
freedreno, nir, ir3: implement GL_EXT_shader_framebuffer_fetch
Andres Calderon Jaramillo (1):
r600: Report multi-plane formats as unsupported
André Almeida (2):
radv: Implement vk.check_status
winsys/amdgpu: Fix amdgpu_cs_query_reset_state2 error log
Antonio Gomes (11):
rusticl: Enabling reading/writing for images created from buffers
rusticl: Enabling image fill for images created from buffers
rusticl: Enable copy for images created from buffers
rusticl: Enable mapImage for images created from buffers
gallium, rusticl: Add tex2d_from_buf in image_view and sampler_view
mesa/st, nine, nouveau: Fix uninitialized pipe_sampler_view structs
lvmpipe/cs: Add support for 2d images created from buffers
gallium: Add new caps PIPE_CAP_LINEAR_IMAGE_(PITCH_ALIGNMENT|BASE_ADDRESS_ALIGNMENT)
rusticl: Implement spec for cl_khr_image2d_from_buffer
llvmpipe: Add new caps PIPE_CAP_LINEAR_IMAGE_(PITCH_ALIGNMENT|BASE_ADDRESS_ALIGNMENT)
iris: Add support for 2d images created from buffers
Anuj Phogat (3):
anv: implement TES distribution mode WA 22012785325
iris: implement TES distribution mode WA 22012785325
intel/genxml/125: Add preferred SLM allocation size field
Asahi Lina (43):
asahi: Split off common BO code into its own file
asahi: Split off macOS support into its own file
asahi: Refuse to transfer out-of-bounds mip levels
meson: Fix Asahi build on macOS
asahi: Fix shader key cloning overreads
asahi: Do not use memctx for pools / meta cache
asahi: Drop agx_device.memctx
asahi: Only apply FS lowerings to fragment shaders
asahi: Add BO_SHAREABLE flag
asahi: Add readonly BO flag
asahi: Identify USC cache invalidate
asahi: Flush USC caches on the first draw
asahi: Drop macOS backend
asahi: Add nocluster,sync,stats debug flags
asahi: Align device submission API with upcoming UAPI
asahi: Implement Linux driver scaffolding, sans UAPI
asahi: Add APIs for DMA-BUF sync file import/export
asahi: Add agx_debug_fault() helper
asahi: Add result buffer to context/batches
asahi: Add agx_bo_mmap() calls to transfer path
asahi: Pull device name from device struct
asahi: Do not overread user index buffers
asahi: Fix scissor culling check when out of bounds for FB/viewport
asahi: Fix device fd leak in agx_close_device
asahi: Destroy the renderonly context on screen destroy
asahi: clang-format the world again
asahi: Assert on TIB strides > 64
asahi: Support importing sync objects on BO export
asahi: Make agx_flush_resource reallocate non-shareable resources
asahi: Extend batch tracking for explicit sync
Revert “asahi: Advertise dual-source blending”
asahi: Make agx_alloc_staging() take a screen instead of a context
asahi: Enable glthread
asahi: Locate low VA BOs correctly
asahi: Fix style nits
asahi: Implement valid buffer range tracking
asahi: Make BO import path failures more robust
asahi: Add a helper macro for debug/error messages
asahi: Add resource debugging
asahi: Print reasons why compression is disabled
asahi: Fix compressed ZS support
asahi: Flip kmsro around to allocate on the GPU
asahi: Allow explicit non-LINEAR modifiers for scanout
Axel Davy (1):
frontend/nine: Fix num_textures count
Bas Nieuwenhuizen (26):
aco: Pass correct number of coords to Vega 1D LOD instruction.
radv: Strictly limit alignment needed within a descriptor set.
radv: Reduce descriptor pool allocation for alignment.
radv: Set FDCC_CONTROL SAMPLE_MASK_TRACKER_WATERMARK
radv: Shift left the tile swizzle more on GFX11.
nir: Apply a maximum stack depth to avoid stack overflows.
radv: Add helper to hash stages.
radv: Hash group handles as part of RT pipeline key.
radv: Use provided handles for switch cases in RT shaders.
radv: Use group handles based on shader hashes.
radv: Implement & expose VK_EXT_pipeline_library_group_handles.
Update my mailmap aliases
ac/surface,radv: Avoid pitch weirdness if image not used for rendertarget.
ac/surface: Only allow stencil pitch adjustment for mipmaps.
ac/surface,radv: Opt out of stencil adjust.
util: Add aligned int64_t types for x86(non 64).
util/disk_cache: Align atomic size.
radv: Align atomic values.
radv: Reserve space in framebuffer emission.
radv: Reserve space in various streamout functions.
radv: Reserve space in conditional rendering functions.
radv: Reserve space in si_cs_emit_cache_flush.
radv: Reserve space for updating DCC metadata.
radv: Reserve space for fast clear related writes.
radv: Reserve space for indirect descriptor set address writes.
radv: Move all the dirty flags from TES binding to TCS binding.
Benjamin Cheng (1):
radv: initialize cmd_buffer upload list earlier
Boyuan Zhang (6):
radeonsi/vcn: check fence before destroying dpb
radeonsi/vcn: check fence before destroying decoder
radeonsi/vcn: validate fence handle before using it
virgl/video: disable decoder fence
virgl: add more formats to conv table
frontends/va: check decoder in va surface call
Brian Paul (9):
anv: add a third memory type for LLC configuration
llvmpipe: do additional checks in lp_state_fs_analysis.c for linear shaders
llvmpipe: remove debug printf spam in lp_setup_wait_empty_scene()
gallium/xlib: call fence_finish() in XMesaSwapBuffers()
llvmpipe: fix ps invocations query bug
llvmpipe: rename some vars related to occlusion query and ps invocations
llvmpipe: s/tabs/spaces/
llvmpipe: s/unsigned/enum pipe_query_type/
llvmpipe: clean-up llvmpipe_get_query_result()
Błażej Szczygieł (1):
glx: Fix glXGetFBConfigFromVisualSGIX
Caio Oliveira (26):
glsl: Account for unsized arrays in NIR linker
hasvk: Update driver name in debug information
intel: Add extra zeros at the end of debug identifiers
iris, crocus: Align workaround address to 32B
anv, hasvk: Align workaround address to 32B
nir: Add nir_intrinsic_rotate
nir/lower_subgroups: Add option lower_rotate_to_shuffle
spirv: Implement SPV_KHR_subgroup_rotate
nir: Support use_scoped_barrier in nir_lower_atomics_to_ssbo
microsoft/compiler: Handle scoped barrier in Tess splitting
gallivm: Fix handling of nir_intrinsic_scoped_barrier
glsl: Implement use_scoped_barrier option for lowering memory barriers
intel/compiler: Mark various memory barriers intrinsics unreachable
pan/compiler: Fix handling of nir_intrinsic_scoped_barrier
pan/midgard: Handle nir_intrinsic_scoped_barrier in Midgard compiler
panfrost: Use NIR scoped barriers instead of memory barriers
spirv: Don’t specify nir_var_uniform or nir_var_mem_ubo in barriers
spirv/tests: Subclass spirv_test helper to namespace the tests
spirv/tests: Add script to generate C array from SPIR-V source
spirv/tests: Parametrize stage in get_nir() helper
spirv/tests: Add some basic control flow tests
spirv: Add skip_os_break_in_debug_build option to use in unit tests
intel/fs: Handle scoped barriers with execution scope
intel/vec4: Handle scoped barriers with execution scope
intel/compiler: Drop brw_nir_lower_scoped_barriers
intel/compiler: Drop non-scoped barrier handling
Caleb Cornett (6):
d3d12: Lower minimum supported Shader Model to 6.0
futex: Change INT_MAX to INT32_MAX.
util: Add #ifdefs for Xbox GDK support.
dxil_validator: Add support for Xbox GDK.
wgl: Add support for Xbox GDK.
d3d12: Add support for Xbox GDK.
Charlie Birks (1):
docs: add a few vulkan extensions supported by multiple drivers
Charmaine Lee (5):
svga: fix resource_get_handle from resource created without SHARED bind flag
svga: fix compatible formats for shareable surfaces
svga: use upload buffer if texture has pending changes
translate: do not clamp element index in generic_run
svga: set PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY for VGPU10 device
Chia-I Wu (38):
turnip: replace TU_DEBUG_DONT_CARE_AS_LOAD by a bool
turnip: make debug_flags a global variable
freedreno: add has_implicit_modifier helper
freedreno: support UBWC scanout
turnip: add a comment to tu_format_for_aspect
turnip: move a comment about FMT6_Z24_UNORM_S8_UINT_AS_R8G8B8A8
turnip: remove tu_native_format::tile_mode
turnip: make tu6_format_*_supported static
turnip: let tu6_format_vtx* take pipe format
turnip: add blit_format_texture
turnip: add blit_format_color and blit_base_format
turnip: handle ubwc in blit_base_format
turnip: reorder tu6_format_*
freedreno/registers: correct WFM bit in CP_REG_TEST
turnip: add a comment to tu_render_pass_cond_config
turnip: skip unnecessary CP_REG_TEST for cond load/store
freedreno/registers: document more bits of CP_REG_TEST
freedreno: avoid conditional ib in fd6_emit_tile
radv: fix a hang with binning on CHIP_RENOIR
turnip: fix a major leak with GPL LTO
turnip: fix a null descriptor set dereference
turnip: avoid FMT6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 for event blits
radv: add a size check in radv_create_buffer for Android
util/log: refactor mesa_log
util/log: allow multiple loggers
util/log: improve logger_file newline handling
util/log: improve logger_android
util/log: add logger_syslog
util/log: add support for MESA_LOG_FILE
util/log: add logger_windbg
mesa: add missing newlines for _mesa_debug/_mesa_log callers
mesa: use mesa_log from output_if_debug
anv: process utrace payloads on queue submission
ci/radv: remove dEQP-VK.image.sample_texture.* fails/flakes
radv: set RADEON_FLAG_GTT_WC for external mem on vram
radv: rework radv_layout_fmask_compressed
radv: add RADV_FMASK_COMPRESSION_PARTIAL
radv: disable tc_compatible_cmask on GFX9 in some cases
Christian Gmeiner (1):
etnaviv: nir: use lower_fround_even
Collabora’s Gfx CI Team (3):
Uprev Piglit to 60e7f0586bac0cfcfcb5871046e31ca2057a5117
Uprev Piglit to 2391a83d1639a7ab7bbea02853b922878687b0e5
Uprev Piglit to 355ad6bcb2cb3d9e030b7c6eef2b076b0dfb4d63
Connor Abbott (12):
freedreno/a6xx: Rename CP_CSQ_IB*_STAT
freedreno/a6xx: Add CP_ROQ_*_STAT
freedreno/a6xx: Fix CP_ROQ_THRESHOLDS_1
freedreno/a6xx: Fill in ROQ status registers
freedreno/crashdec: Fix apparent off-by-one with ROQ size
freedreno/crashdec: Add prefetch test
tu: Fix tile_align_h on a650
freedreno: Fix or/and’ing two BitmaskEnums
tu: Use vk_pipeline_get_renderpass_flags()
vk/render_pass: Support VK_EXT_fragment_density_map
nir, spirv: Add support for VK_EXT_fragment_density_map
tu: Don’t override depth for GMEM
Constantine Shablya (12):
anv: handle ATTACHMENT_OPTIMAL layout
anv: use Vulkan runtime’s robust buffer access
hasvk: use Vulkan runtime’s robust buffer access
anv,hasvk: flush what UNIFORM_READ flushes on SHADER_READ
vulkan: relocate rmv to its correct home
vulkan: tidy up vk_physical_device_features
vulkan: delete trailing namespace
vulkan: add helper for vkGetPhysicalDeviceFeatures2
vulkan: use vk_features for vk_device::enabled_features
anv: use vk_get_physical_device_features
vulkan: fix building with python3.8
vulkan: depend idep_vulkan_runtime_headers on vk_physical_device_features.h
Corentin Noël (12):
ci/venus: Remove failure now passing
kopper: Do not free the given screen in initScreen implementation
ci: uprev virglrenderer
ci/venus: Skip tests risking out of memory issues
ci: uprev virglrenderer and crosvm
ci: Setup XDG_RUNTIME_DIR in crosvm-init
ci: Allow to use crosvm-runner before deqp-runner
ci: Uprev crosvm and virglrenderer
venus/ci: Only run one crosvm instance
mesa: OpenGL ES 3.0 requires EXT_instanced_arrays
glapi: Make EXT_draw_instanced functions available for GLES 2.0
mesa: Add EXT_instanced_arrays support
Daniel Schürmann (82):
radv: CSE ray_launch_{size|id}
radv: rename shader_info->cs.uses_sbt -> shader_info->cs.is_rt_shader
radv: unconditionally enable scratch for RT shaders
radv/rt: introduce and set rt_pipeline->stack_size
radv/rt: use dynamic_callable_stack_base also for static stack_sizes
radv/rt: don’t hash maxPipelineRayRecursionDepth
nir: add Continue Construct to nir_loop
nir: add assertions that loops don’t have a Continue Construct
nir: create nir_push_continue() and related helpers
nir: add lowering for Loop Continue Constructs
spirv: use Loop Continue Construct to emit SPIR-V loops and lower after parsing
nir/lower_continue_constructs: special-case Continue Constructs with zero or one predecessors
nir/lower_continue_targets: only repair SSA when necessary
nir: simplify nir_block_cf_tree_{next|prev}
radv/rt: rename library_pipeline->groups to library_pipeline->group_infos
radv/rt: defer library_pipeline allocation
radv/rt: introduce struct radv_ray_tracing_module
radv/rt: move stack_sizes into radv_ray_tracing_module
radv/rt: only reserve stack_sizes after rt_case insertion
radv: expose radv_postprocess_nir()
radv: expose radv_pipeline_capture_shaders()
radv/rt: introduce and use radv_rt_pipeline_compile()
radv: remove unused parameters from radv_compute_pipeline_compile()
radv/rt: move radv_pipeline_key from rt_variables to traversal_data
nir/gather_info: allow terminate() in non-PS
aco: fix NIR infinite loops
radv/rt: use terminate() when returning from raygen shaders
aco/dominance: set immediate dominator for any BB without predecessors
aco/value_numbering: clear hashmap between disconnected CFGs
aco/dead_code_analysis: don’t add artificial uses to p_startpgm
aco/insert_exec_mask: allow for disconnected CFG
aco/spill: allow for disconnected CFG
radv/rt: place any-hit scratch vars after intersection scratch vars
radv/rt: Fix any_hit scratch variables.
mesa: add gl_shader_stage_is_rt()
radv: add RT shader args
radv: handle RT stages in radv_nir_shader_info_pass()
radv: add RT stages to radv_get_shader_name()
radv: add RT shader handling to radv_postprocess_config
aco: add RT stage enums
aco: don’t set private_segment_buffer/scratch_offset on GFX9+
aco: move rt_dynamic_callable_stack_base_amd to VGPR
aco: implement load_ray_launch_{id|size}
aco: create hw_init_scratch() function for p_init_scratch lowering
aco: implement select_rt_prolog()
radv: add radv_create_rt_prolog()
radv: compile rt_prolog
radv/rt: use prolog for raytracing shaders
aco: remove aco::rt_stack variable
radv: remove unused parameter from radv_open_rtld_binary()
radv: separate radv_postprocess_binary_config() from radv_shader_create()
radv: remove unnecessary copy of binary->config
radv: inline radv_postprocess_config()
radv: separate radv_capture_shader_executable_info() from radv_shader_create()
radv: move gl_shader_stage from radv_binary to radv_shader_info
radv: remove radv_create_gs_copy_shader()
radv: refactor shader_compile()
radv: skip pipeline caching with RADV_DEBUG=shaders
radv: fix radv_shader_binary member fields to 32 bit.
radv/rt: Fix VK_KHR_pipeline_executable_properties
aco: split ps_epilog args before exporting them
aco/ra: adjust_max_used_regs() for fixed Operands
aco: don’t use shared VGPRs for shaders consisting of multiple binaries
radv: update PS num_vgprs in case of epilogs rather than overallocating VGPRs
vulkan/pipeline_cache: remove vk_device from vk_pipeline_cache_object
vulkan/pipeline_cache: Don’t re-insert disk-cache hits into disk-cache
vulkan/pipeline_cache: implement vk_pipeline_cache_create_and_insert_object()
vulkan/pipeline_cache: use vk_pipeline_cache_create_and_insert_object() during vk_pipeline_cache_load()
vulkan/pipeline_cache: add cache parameter to deserialize() function
vulkan/pipeline_cache: move vk_log on failed deserialization to vk_pipeline_cache_load()
radv: derive struct radv_shader from vk_pipeline_cache_object
radv: unconditionally store the binary code in radv_shader
radv: add radv_shader_serialize() and radv_shader_deserialize() functions
radv: add struct radv_pipeline_cache_object
radv: implement radv_shader_create_cached()
radv: use vk_pipeline_cache
radv: clean up pipeline-cache interface
radv/ci: add 2 more Flakes for Navi21
radv/rt: fix total stack size computation
radv/rt: properly destroy radv_ray_tracing_lib_pipeline on error
vulkan/pipeline_cache: replace raw data objects on cache insertion of real objects
radv: add padding to radv_shader_binary_legacy
Daniel Stone (18):
ci/fdno: Only run full tests on a limited subset of machines
ci/radv: Skip vkCreateInstance memory-fail test
ci/anv: Temporarily halve TGL testing load
intel/isl: Don’t scream FINISHME into logs for 3D vs. CCS
ci/radv: Drop raven quick_shader load
ci/fdno: Add a618 Vulkan flakes
ci/zink: Add flake seen in the wild
ci/radv: Lower stoney CTS load
ci/android: Use a more aggressive timeout for the job
ci: Actually run Piglit on LAVA
ci: Disable Collabora LAVA farm
Revert “ci: Disable Collabora LAVA farm”
CI: Disable Windows runners
CI: Disable mingw job
ci/panfrost: Add texturesize flake seen in the wild
CI: Disable freedreno
ci/radeonsi: sort and dedup stoney skips
ci/radeonsi: Skip really slow tests on stoney
Danylo Piliaiev (52):
tu/kgsl: do not use kgsl_command_object::offset
tu: Prevent using stale value of RB_UNKNOWN_88D0 on BLIT
tu: Prevent using stale value of GRAS_SC_CNTL in sysmem clear
freedreno: Document A6XX_GRAS_SC_CNTL::rotation field
turnip: Ensure that there is no renderpass rotation in binning
turnip: Disable draw states after dyn renderpass in all cases
ir3: Consider dst type in ubo_vec4 to ldc lowering
tu: Don’t expose KHR_present_id,KHR_present_wait without KHR_swapchain
turnip: Add debug option to find usage of stale reg values
docs/freedreno: Add info about stale reg stomper dbg option
ci/tu: Add 1/200 pass to test for stale reg usage
ir3: Add cat5/cat7 cache related instructions
ir3: Add cat7 sleep instruction
freedreno/register: Define chip enum values
util/perf: C++-proof util/perf
util/format: Make format_table compatible with C++
spirv: sort spirv_supported_capabilities
vk/vk_extension_gen: Make table struct initializable in C++ on older gcc
vk/wsi: C++-proof wsi_common_drm.h
vk/util: remove (void *) casts from vk_foreach_multi_draw macros
vk/util: Generate defines to help casting structs with vk_find_struct
freedreno/common: C++-proof freedreno_uuid.h
ir3: C++-proofing
tu: C++-proofing: fix offsetof with dynamic array index
tu: C++-proofing: fix struct initializers
tu: C++-proofing: various enum fixes
tu: C++-proof: do not goto over variables initialization
tu: C++-proofing: fix designator initializer order
tu: C++-proofing: fix extension table initialization
tu: C++-proofing: Initialize tu_reg_value in-order by pack funcs
tu: C++-proofing: fix casting from void * fpermissive warnings
tu: C++-proofing: ease access to global bo struct
tu: C++-proofing: prevent taking address from rvalue
tu: C++-proofing: cast result when extracting field from reg value
tu: C++-proofing: misc fixes
freedreno/msm: Rename drm_msm_gem_submit_reloc::or in C++ code
tu: compile as C++
vk/entry_points: Add option to generate template entrypoints
freedreno/regs: Include assert.h in generated headers
tu: Generate entrypoints for each gen
turnip: add cached and cached-coherent memory types
tu/drm: Support cached non-coherent memory
freedreno/registers: Document new CP_EVENT_WRITE::SEQNO
freedreno/registers: More a7xx regs
freedreno/computerator: C++ proofing
freedreno: C++ fixes for computerator to compile
freedreno/computerator: Convert to C++
freedreno: Move fd6_pack.h to common code accessible by computerator
freedreno: Add dummy a730/a740 definition
freedreno/computerator: Templatize a6xx backend
freedreno/computerator: Add support for a7xx
vulkan: Sanitize pSampleMask in CmdSetSampleMaskEXT
Dave Airlie (37):
ci: bump vk cts to 1.3.3.1 + and a crash fix.
vulkan/video: add common h264/h265 parameter set management code.
vulkan/format: add a 10-bit video format
radv: remove the status query mark it unsupported.
radv: add new upload alloc aligned api
ac: add name to codec info struct
radv: adding video decode queue support
radv: add video decoder register setup.
radv/video: add initial frameworking.
radv/video: add initial h264 decoder for VCN
radv/video: add h264 support for uvd
radv: add vcn h265 decode.
radv/video: add h265 decode UVD support
radv/vcn: enable dynamic dpb tier 2 for h264/h265 on navi21+
anv: add video engine support in various places
anv: set Y/4 tiling for video decode images
anv: add video format features for the one supported video output format
anv/format: handle video extensions structs by ignoring them
intel/genxml: align some of the fields with the media driver
intel/genxml: add missing power well control bits
anv/image: allocate some memory for mv storage after video images.
anv: add initial video decode support for h264.
anv/query: add query status report
anv: enable video decode extensions.
anv/video: fix video memory bindings.
crocus: disable Y tiling for render targets properly.
crocus: switch gen4/5 tiling flags to follow suggestions.
llvmpipe: fix compute address bits to return native pointer size.
anv: always pick graphics queue to execute prime blits on.
radv: add video format support to format probing.
anv/video: fix chroma qp to be an integer value.
anv/video: disable picture id remapping.
anv: fix image height for field pictures.
radv/video: fix h264 frame heights when field images are in use
radv/video: fix used for reference flags.
radv/video: fix h265 decoding sizes.
radv/trace: don’t attempt to emit trace on non-graphics/compute queues
David (Ming Qiang) Wu (1):
radeonsi/vcn: add an exception of field case for h264 decoding
David Heidelberg (73):
ci/zink: Penumbra is now fixed.
freedreno/ci: Switch also performance a630 job to manual
ci/anv: add multiple fails uncovered by change of sharding
ci/intel: fully utilize asus-cx9400-volteer
ci/piglit: explicitly define we want GLX tests
ci: migrate from wget to curl
ci/piglit: 2023-01-19 uprev
ci: bump ci-fairy with session support (robust downloads)
ci: Sir trace has small invisible change in rendering
ci: bump Mold to the 1.10.0
ci: uprev piglit (etag md5 checksumming support)
ci/lavapipe: use dxvk for the traces
ci: revert download of git cache to the wget
ci/llvmpipe: add flake timeout for rusticl program@execute@builtin@builtin-float-sincos-1.0
util/process_test: make the error variable static
intel: enable -mfpmath=sse on x86
intel: use c_sse2_arg instead of explicit -msse2
ci/freedreno: add flaking KHR-GL45.buffer_storage.map_persistent_dispatch
meson: print c_cpp_args
intel/vulkan: add missing dependency on generated headers
ci/freedreno: add flaking KHR-GL45.buffer_storage.map_persistent_flush
ci/alpine: keep the curl inside the image
ci: alpine: install bash and coreutils for date -d
ci: implement unified sections
ci: make meson build and test uncollapsed
ci: deqp-runner: drop already unused function
ci: Retry, retry, retry… No one likes to trigger Marge more than once.
ci/zink: add skip for the Single-GL46.enhanced_layouts.ssb_member_align_non_power_of_2
ci/lavapipe: add recent occasional flake
ci/freedreno: rare flake KHR-GL45.sample_variables.mask.rgba8i.samples_4.mask_3
crocus/meson: add dependency on libintel_dev also for versioned static libraries
ci/ci_run_n_monitor: while we usually disable many jobs, print them inline
ci: do not exit when an error happens inside the section
ci/lavapipe: fixes typo
ci/zink: fixup the zink-lvp job
ci: disable mesa-swrast runner jobs
ci/lava: implement the priority
ci/weston: before testing, verify that XWayland is really running
ci/weston: add background PID
ci: add and utilize dalboz devices
ci/amd: move skqp and va jobs on raven from XOrg to the XWayland
ci/panfrost: correct the job name, as it runs on gles2
ci/lava: every LAVA job doesn’t want to run gles2 deqp, drop it
ci: build Wayland support for the amd64
ci/iris: update apl and glk expectations, after enabling Wayland support
ci/clover: disable the jobs
ci/traces: disable nheko trace with zink since it flakes
ci/freedreno: add recent occasional flakes
ci/traces: add two skips due to flakes
ci/intel: add dEQP-EGL.functional.wide_color.window_fp16_default_colorspace flake
ci: distribute XDG_RUNTIME_DIR with setup-test-env script
ci: disable weston session timeout for llvmpipe
meson: implement quirk for the compilation under armv7 GCC with LTO
aco: drop leftover variable
ci: bump Alpine to 3.17 (again)
ci/freedreno: do not build tools executables without explicitly enabling them
freedreno/decode: fix possible overflow
ci: rename .lava-test to .lava-test-deqp to describe it correctly
ci: create lava-test without deqp HWCI_TEST_SCRIPT
ci: remove deqp from lava piglit and traces runs
ci/freedreno: split deqp from other jobs
ci/freedreno: define Google farm specific includes
ci/freedreno: Make traces work on LAVA caching proxy
ci/broadcom: test occasionally fails, but typically passes
ci: disable lima farm, currently out-of-space, needs to be fixed
ci: implement sections for cuttlefish
ci/v3d: add flaking spec@ext_framebuffer_blit@fbo-sys-blit
Revert “mesa: Enable NV_texture_barrier in GLES2+”
ci/amd: update device status
ci/amd: raven is currently downgraded to 2 machines only, adapt
ci/amd: add draw.dynamic_rendering flake
ci/freedreno: fix the a530_piglit job and switch to Weston
panvk: clear dangling pointers
David Redondo (1):
egl/wayland: fix oob buffer access during buffer_fds clean up
David Rosca (2):
frontends/va: Use PIPE_USAGE_STAGING for coded buffer
frontends/va: Map VAEncCodedBufferType buffer as PIPE_MAP_READ
Dmitry Baryshkov (2):
freedreno/a5xx: reorder GPMU registers
freedreno/a5xx: add SP clock control register
Dmitry Osipenko (6):
util/cache_test: Unset env vars left after Cache.List test
util/mesa-db: Don’t account header size
util/mesa-db: Support removal of cache entries
util/cache_test: Remove dummy cache entry added by cache_exists()
util/mesa-db: Introduce multipart mesa-db cache
util/disk_cache: Switch to multipart mesa-db cache
Dylan Baker (38):
VERSION: bump to 23.1.0-devel for further development
docs: reset new_features.txt
meson: bump minimum required version to meson 0.59
meson: replace has_exe_wrapper with can_run_host_binaries
meson: replace uses of ExternalProgram.path with .full_path
meson: drop meson < 0.54 workaround
meson: use a feature option for dri3
meson: use a feature option for gallium-vdpau
meson: use a feature option for gallium-va
meson: use a feature option for gallium-xa
meson: use a feature option for shader_cache
meson: use a feature option for shared-glapi
meson: use a feature option for gles1
meson: use a feature option for gles2
meson: use a feature option for gbm
meson: use a feature option for llvm
meson: use a feature option for valgrind
meson: use a feature option for libunwind
meson: use a feature option for lmsensors
meson: use a feature option for power8
meson: use a feature option for xlib-lease
meson: use a feature option for zstd
meson: use a feature option for egl
meson: use a feature option for shared-llvm
meson: Use feature option methods for xmlconfig
meson: remove version checks for < 0.59
meson: use builtin support for reading version from a file
meson: use [] instead of ‘lib’ for !windows name_prefix
meson: use the same workaround for setting ‘lib’ on windows
meson: combine checks for linker --gc-sections support
util: rzalloc and free hash_table_u64
iris: consider bufmgr creation to have failed if `dup`ing of the fd fails
intel/mi: use 64bit constant for bitshift
intel/dev: create a helper dependency for libintel_dev
docs: Add calendar entries for 23.0 release.
docs: add release notes for 23.0.0
docs: Add sha256 sum for 23.0.0
docs/relnotes: add 23.0.0 to relnotes.rst
Ella Stanforth (1):
v3dv: add support for multi-planar formats, enable YCbCr
Emma Anholt (211):
dri2: Fix exposing robustness with swkms.
ci/llvmpipe: Drop dEQP-EGL.functional.sharing.*.link.7 flakes.
ci/iris: Add known flakes for skqp.
ci/iris: Generalize the 8888_pbuffer EGL known flakes and share with GLK.
ci/zink: Add more blit conversion xfails for a618.
freedreno: Skip CPU/GPU timestamp sync when not supported.
ci/freedreno: Add glx-swap-event-async as a flake.
freedreno/pps: Fix a signed/unsigned complaint.
ci: Enable building the testing drivers with perfetto.
ci: Add some new folks to the restricted-traces access list.
Revert “nouveau/ci: temporary disable gk20a-gles”
ci/virgl: Disable iris traces for now while it’s unstable.
ci: Drop windowoverlap xfails, since it’s always skipped.
ci/zink: Drop xfail for copy-sub-buffer.
ci/zink: Drop glx-swap-copy xfails.
ci/zink: Clear issue #7781 flakes.
ci/freedreno: Switch the piglit job to using a deqp-runner suite.
ci: Move PIGLIT_PLATFORM settings out of the .tomls.
ci/piglit: Add some common piglit skips for Mesa CI’s testing of glx.
ci/piglit: Exclude swapbuffers front-readback tests with PIGLIT_PLATFORM=gbm.
zink: Fatal error if requesting validation and we fail to load the layer.
zink: Add missing Flat decorations on some inputs.
zink: Fix validation failure for maxLod < minLod.
zink: Fix up mismatches of memory model vs addressing model.
zink: Re-emit the SpvBuiltInSampleMask access chain each load.
ci/zink: Add coverage using the vulkan validation layer on lvp.
ci/zink: Update TGL full-run xfails.
ci/zink: Update radv xfails for the recent shadow fixes.
ci/freedreno: Mark max-texture-size as a flake.
ci: Move the performance jobs’ allow_failure:true to the gl rules.
ci: Add manual rules variations to disable irrelevant driver jobs.
freedreno: Don’t sync timestamps while perfetto isn’t running.
ci/zink: Disable Amnesia trace until the linked issue gets fixed.
ci/zink: Move the zink-anv-tgl manual full run to custom manual deps.
ci: Run our manual jobs during the nightly scheduled run.
ci: Fix perf jobs blocking Marge pipelines.
ci: Fix perf job condition.
ci: Drop the itoral-gl-terrain demo from traces.
tu: Mark tiling impossible if we couldn’t lay out gmem in the first place.
turnip: Optimize tile sizes to reduce the number of bins.
tu: Only emit the conditional gmem subpass resolves when gmem is possible.
turnip: Make the tiling-impossible case have an impossible tile layout.
gallivm: Optimize emit_read_invocation’s first-invocation loop.
gallivm: Refactor out a shared “get the first active invocation” loop.
gallivm: Return 0 first_active_invocation when we know that up front.
gallivm: Use cttz instead of a loop for first_active_invocation().
gallivm: Use first active invocation in some image/ssbo accesses.
ci/lvp: Drop the subgroupbroadcast skips.
llvmpipe: Enable LP_DEBUG on normal builds.
gallivm: Enable GALLIVM_DEBUG (mostly) on non-DEBUG builds.
gallivm: Fix the type of array nir_registers.
gallivm: Fix codegen performance for constant-index register array stores.
gallivm: Do the same codegen improvement for constant-index array loads.
ci/swrast: Drop skips for tests whose perf had been fixed.
ci/llvmpipe: Drop skip of InteractionFunctionCalls2.
ci/freedreno: Don’t forget to report flakes on a618, too.
u_trace: Add an interface for checking trace enablement outside a context.
zink: Add tracing of blit operations.
ci: Disable systems in my farm that haven’t recovered.
ci/zink: Update TGL full-run xfails.
ci/freedreno: Disable the a306_piglit_gl job.
ci/freedreno: Update a530 manual-run xfails.
ci/freedreno: Add an xfail for a618 VK full run.
ci/freedreno: Update a3xx piglit_shader xfails.
ci/nouveau: Disable the gm20b jobs entirely.
ci/radv: Update navi21 llvm xfails.
ci/crocus: Update HSW expectations.
ci/freedreno: Update manual-run xfails for a530.
Revert “freedreno/a5xx: Fix clip_mask”
ci/radv: Add a skip for navi21-llvm for a test that consistently times out.
ci/etnaviv: Drop stale xfails from gc7000.
ci/etnaviv: Update deqp xfails for gc2000.
egl/kopper: Add assert for no kopper in dri2_copy_region.
egl: Add a note explaining the swapBuffers badness in dri2_x11_copy_buffers().
egl/kopper: Use the kopper private interface for swapBuffers.
egl/kopper: Pass ancillary invalidate flush flags down to gallium.
ci: Add a manual full and 1/10th hasvk CTS runs.
hasvk: Silence conformance warning in CI.
hasvk: Fix SPIR-V warning about TF unsupported on gen7.
anv: Fix gfx8/9 VB range > 32bits workaround detection.
hasvk: Fix gfx8/9 VB range > 32bits workaround detection.
glsl: Drop the (v.x + v.y + v.z + v.w) -> dot(v, 1.0) optimization.
ci/etnaviv: Drop one more gc7000 xfail.
ci/freedreno: Drop a530 piglit_gl coverage.
ci/turnip: Drop the #8219 xfail.
ci/zink+turnip: Disable flaky minetest trace.
ci/hasvk: Add a synchronization flake.
ci: Fix stage of etnaviv manual runs.
ci/zink: Add a glx flake on anv
ci/crocus: Add new tess xfails and a link to the regression bug report.
ci/crocus: Mark unvanquished as flaky.
anv: Skip the RT flush when doing depth-only rendering.
anv: Skip BTI RT flush if we’re doing an op that doesn’t use render targets.
glsl/opt_algebraic: Drop ~~x == x transformation.
glsl/opt_algebraic: Drop log(exp(x)) -> x and exp(log(x)) -> x optimisations.
glsl/opt_algebraic: Drop pow-recognizer.
glsl/opt_algebraic: Drop abs(-x) -> abs(x) and abs(abs(x)) -> abs(x).
glsl/opt_algebraic: Drop -(-x) -> x optimization.
glsl/opt_algebraic: Drop f2i(trunc(x)) -> f2i(x) optimization.
glsl/opt_algebraic: drop fsat(fadd(b2f(x), b2f(y))) -> b2f(ior(x, y)) opt.
glsl/opt_algebraic: Drop shifts of 0 optimizations.
glsl/opt_algebraic: Drop pow optimizations.
glsl/opt_algebraic: Drop rcp optimizations.
glsl/opt_algebraic: Drop and/or/xor optimizations.
glsl/opt_algebraic: Drop fdiv(1,x) -> frcp(x) and fdiv(x,1) -> x optimizations.
glsl/opt_algebraic: Drop add/sub with 0 optimizations.
glsl/opt_algebraic: Drop x + -x -> 0 optimization.
glsl/opt_algebraic: Drop csel(true/false, x, y) optimization.
nir: Add optimization for fdot(x, 0) -> 0.
glsl/opt_algebraic: Drop fdot 0-channel optimizations.
glsl/opt_algebraic: Drop scalar all_eq/any_neq -> eq/neq opt.
glsl/opt_algebraic: Drop the eq/neq add-removal optimization.
glsl/opt_algebraic: Drop no-op pack/unpack optimization.
glsl/opt_algebraic: Drop the flrp/ffma simplifiers.
glsl/opt_algebraic: Drop some fmul simplifications.
nir: Port a floor->truncate algebraic opt pattern from GLSL.
glsl/opt_algebraic: Drop the ftrunc pattern recognizer.
glsl/opt_algebraic: Drop the flrp recognizer.
glsl: Remove unused as_rvalue_to_saturate().
ci: Update traces expectations for gutting glsl opt_algebraic.
panfrost/midgard: Fix handling of csel with a vector constant condition.
panfrost/midgard: Drop redundant arg to emit_explicit_constant.
glsl: Move lower_vector_insert to GLSL-to-NIR.
nir/split_64bit_vec3_and_vec4: Handle 64-bit matrix types.
gallivm: Return 0 for first active invocation when no invocations are active.
gallivm: Use first_active_invocation for ubo/kernel memory loads.
gallivm: Use first_active_invocation for scalar SSBO loads.
gallivm: Add some notes about other invocation_0_must_be_active usages.
ci: Add some xfail updates from VKCTS 1.3.5.0 for the manual jobs.
ci/etnaviv: Drop the dEQP-GLES2.functional.uniform_api.random.94 xfail.
anv+hasvk: Use driconf to disable 16-bit for zink.
zink: Pass the cmdbuf to the end of the marker, too.
Revert “ci: disable mesa-swrast runner jobs”
ci: Re-enable some swrast testing using fd.o’s shared runners for now.
glsl/nir: Include early glsl-to-nir output in NIR_DEBUG=print.
glsl_to_nir: Use a variable’s constant_value if it wasn’t const-propped out.
glsl: Delete constant propagation pass.
glsl: Delete constant folding pass.
glsl: Delete constant-variables pass.
ci: Update trace expectations for GLSL constant prop removal.
ci/zink: Update TGL xfails/flakes based on the last nightly pipelines.
ci/turnip: Extend a630 vk full timeout to 3 hours.
ci/iris: Add skips for slow tests on APL.
turnip: Don’t push inline uniform buffer contents outside constlen.
ci/turnip: Clear out stale xfails.
ci/turnip: Disable dEQP-VK.image.queue_transfer.* for now.
ci/turnip: Move some more of the 1.3.5 new xfails under links.
glsl: Simplify vector constructors from scalars.
glsl/lower_precision: Add a unit test that I thought we might fail at.
glsl/lower_precision: Add a cut-down testcase for #8124
glsl: Set the precisions of builtin function arguments and returns.
glsl: Handle highp promotion of builtin function args in the builtins.
glsl: Set the precision of function return value temporaries.
glsl/lower_precision: Drop most special-casing of builtin arg precision.
glsl: Fix the precision of atomic counter builtin function args.
glsl/lower_precision: Add actual spec quotes for “check_parameters”
nir/lower_mediump: Fix assertion about copy_deref lowering matching.
ci/iris: Update more manual job xfails from the Wayland build change.
ci/crocus: Update expectations from VK CTS 1.3.5.0.
ci/hasvk: Update some xfails from the 8-sample fast clear disable.
ci/etnaviv: Get the gc2000_piglit manual job mostly working.
glsl/standalone: Pull program create/destroy out to a public function.
glsl/standalone: Pull out a helper function for adding GLSL source shaders.
glsl/standalone: Make all standalone contexts have NewProgram set.
glsl: Write a new test for GLSL and NIR mediump lowering.
ci/crocus: Fix 1.3.5.0 xfails.
ci/etnaviv: Polish the gc2000 xfails a bit.
ci/zink: Update the tgl manual run xfails.
gallivm: Skip loads/stores that are definitely outside of compact vars.
nir/lower_sysvals: Add support for un-lowered tess_level_inner/outer.
nir_to_tgsi: Handle stores to compact outputs.
glsl: Delete the lower_tess_level pass.
glsl: Remove the TessLevel lowering special case from xfb.
glsl: Drop dead prototype.
ci/freedreno: Flake KHR-GL45.shader_image_load_store.basic-allTargets-store
ci/broadcom: Skip another texelfetch case.
perfetto: Add a .clang-format for the directory.
intel/perfetto: Drop unused “pipelined” field.
perfetto: Make a MesaRenderpassDataSource with common setup/start/stop.
perfetto: Deduplicate clock sync packet emit from renderstage sources.
perfetto: Move intel’s cmdbuf/queue annotation code to the shared util.
ci/zink: Drop validation exception for leaks at device destroy.
ci/zink: Disable godot-tps-gles3 on a630.
docs: Update Vulkan renderpass docs for !22191
ci: Add missing dependency on doxygen sources for docs-generation jobs.
docs: Claim less functionality for glsl_compiler.
glsl: Move ForceGLSLAbsSqrt handling to glsl-to-nir.
zink: Add mapping for nir_op_ldexp, but disable it for 64-bit’s sake.
glsl: Retire ldexp lowering in favor of the nir lowering flag.
glsl/softfp64: GC the temp vars after we lower them to SSA.
glsl/softfp64: Add fisfinite lowering.
state_tracker: Lower frexp before lowering doubles.
intel: Always call nir_lower_frexp.
ir3: Move turnip’s nir_lower_frexp to the shared compiler.
nouveau: Add missing nir_opt_algebraic_late.
nouveau: Enable frexp lowering in the backend.
zink: Enable nir_lower_frexp.
v3d: Lower frexp in the GL compiler like we do in Vulkan.
agx: Enable nir_lower_frexp.
panfrost/midgard: Enable nir_lower_frexp.
nir_to_tgsi: Always lower frexp_exp/sig.
glsl: Drop frontend lowering of 32-bit frexp.
glsl: Drop PIPE_SHADER_CAP_DFRACEXP_DLDEXP_SUPPORTED.
tgsi: Drop TGSI_OPCODE_DFRACEXP.
ci/zink: Disable a630 portal-2-v2 due to kernel OOMs.
etnaviv: Fix regression from if_uses change.
blob: Don’t valgrind assert for defined memory if we aren’t writing.
util/log: Fix log messages over 1024 characters.
vulkan: Handle alignment failure in the pipeline cache.
vulkan: Actually increment the count of objects in GetPipelineCacheData.
ci/radeonsi: Mark glx-make-current as flaky.
EmperorPenguin18 (1):
v3d: expose more drm formats with SAND128 modifier
Eric Engestrom (172):
bin/ci: add gitlab_gql.py.cache to the .gitignore
mesa/st: drop unused param
ci/bare-metal: add more timestamps to help debugging issues
ci: be explicit about the `meson setup` subcommand
docs: add release notes for 22.3.4
docs/relnotes: add sha256sum for 22.3.4
docs: update calendar for 22.3.4
meson: turn android-libbacktrace into a feature option
v3dv: mark dEQP-VK.api.command_buffers.record_many_draws_secondary_2 as flaky
ci/android: move common config to common job
ci/android: move virgl-specific gpu_mode to virgl-defined variables
ci/android: move virgl-specific fails/flakes/skips lists to virgl-defined variables
ci/android: move virgl-specific deqp suite to virgl-defined variables
ci/android: move virgl-specific so lib name to virgl-defined variables
ci/android: add missing line terminator at the end of the file
docs: add release notes for 22.3.5
docs: update calendar for 22.3.5
panfrost: drop no-longer-needed libglsl
gallium/u_screen.h: add missing stdint.h include
util: avoid calling kcmp on Android
etnaviv: use simple_mtx to avoid breaking windows in the next commit
gallium: move etnaviv screen_lookup_or_create function to common code
freedreno: replace custom code with u_pipe_screen_lookup_or_create()
lima: replace custom code with u_pipe_screen_lookup_or_create()
v3d: use u_pipe_screen_lookup_or_create() to keep track of and reuse screens
vc4: use u_pipe_screen_lookup_or_create() to keep track of and reuse screens
panfrost: use u_pipe_screen_lookup_or_create() to keep track of and reuse screens
asahi: use u_pipe_screen_lookup_or_create() to keep track of and reuse screens
u_pipe_screen_lookup_or_create: avoid re-querying the fd to have a consistent hash key
broadcom/ci: mark test as flaky
vk/util: keep track of extension requirements
vk/runtime: keep track of supported instance extensions
vk/runtime: turn vk.xml extension requirements into asserts
meson: move float64_glsl_file one meson.build up
meson: only build mapi when needed
meson: only build the loader when needed
meson: only build libglsl_util when needed
meson: only build glsl when needed
meson: drop `TODO: opengl`, it’s done
ci: simplify adding & removing deqp patches
ci: remove no-op sed
ci: fix grouping of image tags
ci: bump tags of deqp images
docs: add 23.1 branchpoint & rc dates
meson: make GLX require OpenGL
meson/windows: only build libgl-gdi for desktop gl
meson: allow building GLES without GL
mesa: add _mesa_is_desktop_gl_compat() and _mesa_is_desktop_gl_core() helpers
mesa: make use of the new _mesa_is_desktop_gl_compat() helper
mesa: make use of the new _mesa_is_desktop_gl_core() helper
mesa: make more use of the existing _mesa_is_gles* helpers
mesa: add & use new _mesa_is_gles1() & _mesa_is_gles2() helpers
mesa: make more use of the new _mesa_is_gles1() helper
mesa: make more use of the new _mesa_is_gles2() helper
mesa: optimize out _mesa_is_desktop_gl*() and _mesa_is_gles*() calls when not built
ci: stop watching for changes in removed script
meson: improve formatting of options file
broadcom/ci: refactor a bit
broadcom/ci: fold .vc4-rpi3-piglit:armhf into its only user
broadcom/ci: use deqp-runner to run piglit tests
docs/release-calendar: drop the last 22.2.x, it won’t happen
broadcom/ci: group x11 and wayland variant of the same test failing
broadcom/ci: use weston’s xwayland instead of starting X as well
broadcom/ci: add x11- prefix to x11 EGL tests
broadcom/ci: drop create_pixmap_surface from the fails; it passes now
broadcom/ci: skip buffer_age.no_preserve and swap_buffers_with_damage on wayland
broadcom/ci: add two known failures
broadcom/ci: re-enable egl on wayland
docs: include explicit `setup` in instructions
docs: add release notes for 22.3.6
docs/relnotes: add sha256sum for 22.3.6
docs: update calendar for 22.3.6
v3d: update supertuxkart reference after 1c028a4d5b623e73bdf5
docs: mention the meson summary
docs: mention `meson configure` and drop broken workaround script
meson: reuse vulkan_wsi_list for defining vk_wsi_args
meson: replace vk_wsi_args with dependencies to let meson take care of transitivity
egl: include directly the useful vulkan header, instead of including everything
glx: include directly the useful vulkan header, instead of including everything
gbm: drop unnecessary vulkan dependency
radv: split linker script for android since it requires different symbols
glsl: align definition of _mesa_problem with the one in main/error.h
glapi/meson: drop duplicate line in deps
meson: allow checking for null pointers even if they’re supposed to be non-null
panfrost/ci: add EGL tests
asahi/winsys: add .clang-format
vk: move radv’s linker symbols scripts for use in all drivers
v3dv: add linker script to fix android symbols
tu: add linker script to fix android symbols
anv: add linker script to fix android symbols
vn: add linker script to fix android symbols
android/vk: drop unnecessary symbols
vk: be stricter about symbols check between android and other platforms
v3d/ci: add dEQP-GLES3.functional.texture.specification.teximage2d_pbo.*_cube flakes
osmesa: add exported symbols check
docs: add release notes for 22.3.7
docs/relnotes: add sha256sum for 22.3.7
docs: update calendar for 22.3.7
v3dv/ci: add a test to the known failures
meson: bump minimum version to 0.60
meson: allow feature options to take true/false to mean enabled/disabled
meson: inline gtest_test_protocol now that it’s always ‘gtest’
v3dv: split out broadcom_shader_stage_to_gl() calls to improve readability
ci: take valve farm offline
ci: disable weston session timeout
broadcom/ci: no need to skip the tests that swap buffers anymore
ci/broadcom: move rare failure to the flakes
ci: drop redundant .no_scheduled_pipelines-rules + .core-rules since the latter already includes it
ci/rustfmt: simplify getting all the rust files
ci/rustfmt: print which files are checked
ci: group RESULT logic in a single place
v3dv/ci: fix test name (`,Fail` is not part of the test name)
asahi: replace copies of .clang-format with symlinks
asahi: fix a few typos
v3d: fix `dirty` bitset being too small to accept V3D_DIRTY_SSBO
v3dv: use common GetPhysicalDeviceFeatures
v3dv: reorder features as 1.0, 1.1, 1.2, 1.3
v3dv: use vk_get_physical_device_features
v3d/ci: add another depthstencil-default_fb-drawpixels-* to the flakes
v3d/ci: group dEQP-GLES3.functional.texture.specification.teximage2d_pbo.* flakes and add another one
ci: centralize detection of ccache in link-werror wrapper
ci: add linker wrapper for clang
ci: always use the -Werror wrapper
ci: deduplicate compiler wrappers
ci/docs: start documenting ci_run_n_monitor.py
v3d: add link to issue investigating failure
asahi: change create_renderonly signature to uniformize it
etnaviv: change create_renderonly signature to uniformize it
freedreno: change create_renderonly signature to uniformize it
lima: change create_renderonly signature to uniformize it
panfrost: change create_renderonly signature to uniformize it
v3d: change create_renderonly signature to uniformize it
vc4: change create_renderonly signature to uniformize it
kmsro: uniformize renderonly creation
kmsro: sort drivers alphabetically
ci/broadcom: consolidate vc4-rpi3* jobs into a single vc4-rpi3-gl:armhf
ci/broadcom: consolidate v3d-rpi4* jobs into a single v3d-rpi4-gl:armhf
ci/broadcom: slightly increase coverage of vk tests
vc4/ci: add arm64 failure to flakes as it works on armhf
broadcom/ci: run gl jobs on arm64, just like vk
vc4/ci: add another sync flake
panfrost: assign the correct create_for_resource from the start
Revert “broadcom/ci: run gl jobs on arm64, just like vk”
v3dv/ci: mark known dEQP-VK.wsi.xlib.surface.query_formats failure
ci/rustfmt: make sure to only check each file once
v3d: disable GL_NV_conditional_render
VERSION: bump for 23.1.0-rc1
.pick_status.json: Update to 8ebc5cbe2b828f34b9bfb32c528d3514ead59798
v3dv/ci: drop fixed failure from fails.txt
.pick_status.json: Update to 0d7912d239dac5bf3c8b07f2a6ca467f760d6aa6
.pick_status.json: Update to 543b6ca7c4b00c4bfff5668ba0a0643d565db201
amd: fix buggy usage of unreachable()
compiler: fix buggy usage of unreachable()
pvr: fix buggy usage of unreachable()
vk/util: fix buggy usage of unreachable()
v3d: add flake spec@ext_framebuffer_blit@fbo-sys-sub-blit
VERSION: bump for 23.1.0-rc2
.pick_status.json: Update to 3017d01c9ded9c9fd097b600081b1bbe86e90fb8
.pick_status.json: Update to a18a51a708a86f51e0a5ab031b379f65bc84fb49
.pick_status.json: Update to c060b649c5a866f42e5df73f41c6e2809cf30e99
ci: rework vulkan validation layer build script
.pick_status.json: Update to 3f14fd8578549e34db2f564396f300819b2ff10f
VERSION: bump for 23.1.0-rc3
.pick_status.json: Update to 040aeb5a23e5cc8a71a352e55282d514dd2ab64f
.pick_status.json: Update to 9f522ac0c65ceae11ad1a4e84ec9f32a9393a25c
.pick_status.json: Update to efc94390f716b70ac1d5b09c6f949f938aeadcac
VERSION: bump for 23.1.0-rc4
.pick_status.json: Update to 6d84b34359dcbad477209adb9f9d0592c5a71bb9
.pick_status.json: Update to cb4e4fc5de48886758a26ff19d322947b5abfcec
dzn: fix pointer type mismatch
.pick_status.json: Update to 57afa7c0b12d6d0c9013368853080dfea5b50d07
.pick_status.json: Update to 31e6d15801a9904089aa2913c8eb5a31b79c7dfc
Erico Nunes (5):
lima/ci: Add more piglit unsupported tests to skip
Revert “CI: Lima farm is offline”
lima: don’t use resource_from_handle while creating scanout
lima/ci: restore swap buffers egl tests
Revert “ci: disable lima farm, currently out-of-space, needs to be fixed”
Erik Faye-Lund (54):
zink: whitespace fixup
zink: fix depth-clip disable cap
zink: remove depth_clip_control_missing workaround
radeonsi: respect smoothing_enabled
meson: remove duplicate add_devenv call
meson: remove deprecated osmesa-bits option
meson: remove deprecated dri-drivers option
meson: avoid using deprecated build_root() method
meson: use files() instead of joining paths
freedreno/meson: simplify script-path logic
meson: do not reconstruct ICD paths
anv, hasvk: remove stale TODO-files
zink: correct companies in requirements
zink: remove incorrect trailing comma
meson: remove unused USE_FOO_ASM defines
vulkan: prefer vulkan_core.h over vulkan.h
meson: don’t pass vk wsi args where they don’t belong
Revert “meson: Fix Asahi build on macOS”
zink: prefer vulkan_core.h over vulkan.h
zink: get rid of needless dependency
ci: correct typo in name of linkcheck job
docs: update link to intel optimization reference manual
nir: add a print_internal debug-flag
docs: implement new vk-feat role
docs/zink: use vk-feat role for features
docs/zink: remove some trailing spaces
docs/zink: fixup wording of the GL 4.6 requirements
meson: correct typo in comment
ci: move docs-stuff out of root .gitlab-ci.yml
docs: fixup broken envvar-role syntax
docs: escape a few more strings
docs: fixup broken indentation
docs/zink: mention vk1.2 mirror-clamp feature option
docs/zink: clean up requirements-language
docs: move developers article to main website
docs: remove old thanks-article
docs: prefer http-links over ftp
docs/freedreno: fix turnip-heading level
docs: drop reference to modindex
docs: move old relnotes to _extra directory
docs: use version-number as toctree-title for relnotes
zink: emit terminate for spir-v 1.6
zink: use demote from spir-v 1.6 when possible
zink: use spir-v 1.6 local-size when needed
zink: enable spir-v 1.6 for vulkan 1.3
docs: format code-block as ini
docs: format code-block as toml
docs: make code-block indents consistent
ci: move virgl-rules after intel-rules
virgl/ci: clean up manual rules for virgl
ci: remove unused rules
zink: do not use sampled-image for buffers
nir: fix constant-folding of 64-bit fpow
llvmpipe: fixup refactor copypasta
Faith Ekstrand (99):
nir: Add more opcodes to nir_tex_instr_is_query()
nir/builder: Add some texture helpers
radv: Use the new NIR builder tex helpers for meta
anv: Refactor Android externalFormat handling in CreateYcbcrConversion
anv/android: Use VkFormat for externalFormat
util/format: YUYV and UYVY have 4 8-bit channels
vulkan/formats: Add YCbCr format information
vulkan: Add a common vk_ycbcr_conversion struct
anv: Use the common vk_ycbcr_conversion object
anv: Use the YCbCr format info from common code
nir: Add copyright and include guards to nir_vulkan.h
anv,nir: Move the ANV YCbCr lowering pass to common code
gallium,util: Pull u_indices and u_primconvert back into gallium
mailmap: Remap e-mail addresses for Faith Ekstrand
vtn: Set alignment on initial UBO/SSBO casts
anv: Let spirv_to_nir() set UBO/SSBO base cast alignments
hasvk: Let spirv_to_nir() set UBO/SSBO base cast alignments
intel/compiler: Document wm_prog_key::persample_interp
intel/nir: Lower barycentrics to per-sample in a dedicated pass
nir: Remove nir_lower_io_force_sample_interpolation
intel/compiler: Use SHADER_OPCODE_SEND for PI messages
intel/fs: Return early in a couple builtin setup helpers
intel/compiler: Convert brw_wm_aa_enable to brw_sometimes
intel/fs: Make per-sample and coarse dispatch tri-state
intel/compiler: Convert wm_prog_key::persample_interp to a tri-state
intel/compiler: Convert wm_prog_key::multisample_fbo to a tri-state
intel/fs/validate: Assert SEND [extended] descriptors are uniform
intel/fs: Break out yet another FB write helper
intel/fs: Rework dynamic coarse handling
nir/deref: Preserve alignments in opt_remove_cast_cast()
nir/from_ssa: Use more helpers in resolve_parallel_copies
nir/from_ssa: Only re-locate values that are destinations
nir/from_ssa: Move the loop bounds check in resolve_parallel_copy
nir: Add a load/store bit size lowering pass
intel/nir: Use nir_lower_mem_access_bit_sizes()
Revert “vk/runtime: turn vk.xml extension requirements into asserts”
Revert “vk/util: keep track of extension requirements”
vulkan: Remove unused fields from Extension and ApiVersion
vulkan: Improve extension parsing
vulkan: Parse the platform in Extensions.from_xml()
vulkan: Add a get_all_required() helper
vulkan: Properly filter entrypoints
vulkan: Properly filter by api in enum_to_str
Vulkan: Properly filter structs in vk_cmd_queue_gen
vulkan: Filter out provisional extensions
vulkan: Move the features generator to vulkan/util
vulkan: Properly filter structs in vk_physical_device_features
vulkan/layers: Use PUBLIC instead of VK_LAYER_EXPORT
vulkan/device-select-layer: Include vulkan.h
vulkan: Update the XML and headers to 1.3.241
nir/lower_io: Handle buffer_array_length for more address modes
anv: Drop our manual SSBO size handling
hasvk: Drop our manual SSBO size handling
panvk: Drop our manual SSBO size handling
turnip: Set spirv_options::use_deref_buffer_array_length
lavapipe: Set spirv_options::use_deref_buffer_array_length
v3dv: Set spirv_options::use_deref_buffer_array_length
spirv: Always emit deref_buffer_array_length intrinsics
nir: Check against combined alignment in nir_lower_mem_access_bit_sizes
nir: Add mode filtering to lower_mem_access_bit_sizes
nir: Add UBO support to nir_lower_mem_access_bit_sizes
nir: Add a combined alignment helper
nir: Rename align to whole_align in lower_mem_load
nir: Rename nir_mem_access_size_align::align_mul to align
nir: Make chunk_align_offset const in lower_mem_load()
nir: Handle wider unaligned loads in lower_mem_access_bit_size
intel/nir: Limit unaligned loads to vec4
vulkan/runtime: Rename and document storage image Z range
intel/blorp: Set array_len for 3D images properly
isl: Set Depth to array len for 3D storage images
intel: Use nir_lower_tex_options::lower_index_to_offset
vulkan: Update XML and headers to 1.3.244
vulkan: Provide wrappers for VK_EXT_map_memory2 functions
anv: Limit memory maps to the client-allocated size
anv: Implement VK_KHR_map_memory2
intel/isl: Support Yf/Ys/Tile-64 in isl_surf_get_image_offset_sa
intel/blorp: Drop the TODO file
docs: Fix Faith’s name in relnotes
nir: Drop a bunch of Authors tags
spirv: Drop a bunch of Authors tags
intel: Drop some author comments and update Faith’s name
util,mesa,panfrost: Drop some author tags
vulkan: vk_android.c should be copyright Intel
util: Update some copyright tags
CODEOWNERS: s/jekstrand/gfxstrand
vulkan,anv,hasvk,radv: Add a common vk_image_usage_to_ahb_usage helper
vulkan/android: Fix hardware buffer usage flags
vulkan: Add an ahardware_buffer_format field to vk_image
anv,hasvk: Set vk_image.ahardware_buffer_format
radv: Set vk_image.ahardware_buffer_format
vulkan,anv,hasvk,radv: Unify Android hardware buffer creation
vulkan: Add a vk_device_memory base struct
anv: Use the new vk_device_memory base struct
vulkan: Record pipeline flags in the render pass
vulkan: Plumb rendering flags through vk_graphics_pipeline_state
anv/pipeline: Use feedback loop flags for self-dependencies
hasvk/pipeline: Use feedback loop flags for self-dependencies
vulkan: Drop vk_render_pass_state::*self_dependenc*
vulkan: Drop VkRenderingSelfDependencyInfoMESA
Felix DeGrood (10):
intel/perf: Hide extended metrics by default
anv: cs_stall during compute state flush on < gen12.5
anv: only emit CFE_STATE when scratch space increases
anv: set CFE_STATE.OverDispatchControl to default
iris: report draw count for perfetto
anv/blorp: support surf generation for addresses
anv/blorp: implement anv_cmd_buffer_fill_area
anv/blorp: add flush reasons to RT flushes
anv: reset query pools using blorp
anv: disable reset query pools using blorp opt on MTL
Filip Gawin (2):
crocus: don’t quantize the clear value
nine: add fallback for D3DFMT_D16 in d3d9_to_pipe_format_checked
Francisco Jerez (11):
intel/fs/gfx12: Ensure that prior reads have executed before barrier with acquire semantics.
intel/disasm/gfx12+: Use helper instead of hardcoded bit access for 64-bit immediates.
intel/disasm/gfx12+: Fix print out of non-existing condmod field with 64-bit immediate.
intel/eu/gfx12+: Implement decoding of 64-bit immediates.
intel/fs/gfx12+: Drop redundant handling of SHADER_OPCODE_BROADCAST in exec pipe inference.
intel/fs: Fix src and dst types of LOAD_PAYLOAD ACP entries during copy propagation.
intel/eu/gfx8-9: Fix execution with all channels disabled due to HW bug #220160235.
intel/rt: Fix L3 bank performance bottlenecks due to SW stack stride alignment.
intel/fs: Track force_writemask_all behavior of copy propagation ACP entries.
intel/fs: Fix copy propagation dataflow analysis in presence of force_writemask_all ACP overwrites.
intel/fs: Fix register coalesce in presence of force_writemask_all copy source writes.
Frank Binns (7):
pvr: small cleanups
pvr: remove start/stop transfer flags
pvr: stop restricting the compiler to the Sascha Willems triangle demo
pvr: remove duplicate define
pvr: initialise size for placeholder “zeroed” shaders
pvr: replace nop binary shader with run-time compiled shader
pvr: fix clang-format issue
Friedrich Vock (26):
radv/rt: Divide by the correct workgroup size
radv/bvh: Prevent NANs when computing node cost
radv/rmv: Also check the other pid field
radv/rmv: Avoid more CPU unmap deadlocks
radv/rmv: Log bo destruction before freeing it
radv/rmv: Correct timestamp shifting
vulkan/rmv: Use the timestamp divisor instead of a hardcoded value
vulkan/rmv: Remove delta parameter from dump helpers
mesa: Report GL_SHADER_BINARY_FORMAT_SPIR_V as supported
docs: Fix formatting for RMV tracing docs
radv: Extend hit attribute lowering for LDS
radv: Use LDS for closest-hit hit attributes
radv: Emit RT shader VA user SGPR
radv/rt: Add shader config combination/postprocessing utils
radv: Add RT shader stage names for executable properties
aco: Swap operands for v_and_b32 in RT prolog
radv/rt: Also adjust the SGPR count in postprocess_rt_config
aco: Un-swap addressable VGPRs/SGPRs in RT prolog
radv: Work around use-after-free compiler errors
radv: Add RT stages to radv_mesa_to_rgp_shader_stages
radv/rmv: Fix creating RT pipelines
radv/rmv: Fix import memory
radv/rt: Plug some memory leaks during shader creation
radv: Don’t leak the RT prolog binary
radv: Always call si_emit_cache_flush before writing timestamps
radv: Add driconf to always drain waves before writing timestamps
GH Cao (1):
gallium: Add MCJIT target triplet for Windows ARM64
Ganesh Belgur Ramachandra (1):
ac/nir: fix CDNA image lowering for array textures
Georg Lehmann (81):
Revert “aco: Combine v_cvt_u32_f32 with insert to v_cvt_pk_u8_f32.”
aco: use s_bfm_64 for constant copies
aco: use s_pack_ll_b32_b16 for constant copies
aco: Improve wave64 cycle estimates.
aco: fix imod/omod for gfx11 VOP3 opcodes
aco: add mov/cndmask opcodes to does_fp_op_flush_denorms
aco: don’t allow output modifiers for v_cvt_pkrtz_f16_f32
aco: allow output modifiers for ldexp_f16
aco: don’t list imod/omod support v_fmaak_f32/v_fmamk_f32
aco: support omod/imod for v_fmac_f16
aco: remove stale TODOs about v_interp opsel
aco: new 16bit VOP3 opcodes can use opsel
aco: Don’t use vcmpx with DPP.
aco: combine a ^ ~b and ~(a ^ b) to v_xnor_b32
amd,nir: remove byte_permute_amd intrinsic
nir: change 16bit image dest folding option to per type
amd: don’t use d16 for integer loads
amd: d16 uses rtz conversion for 32bit float
aco: use v_permlane(x)16_b32 for masked swizzle
aco/gfx11: use dpp_row_xmask and dpp_row_share
aco: use and swizzle mask in dpp quad perm
aco/optimizer_postRA: assume all registers are untrackable in loop headers
nir/opt_algebraic: add patterns for iand/ior of feq/fneu with 0
aco: mark mad definition as precise if the mul/add were precise
aco: use v_fma_mix_f32 for v_fma_f32 with 2 fp16 representable, different literals
nir/lower_mediump: don’t use fp16 for constants if the result is denormal
aco: treat VINTERP_INREG as VALU
aco/ir: rework IR to have one common valu instruction struct
aco/ra: set opsel_hi to zero when converting to VOP2
aco: validate VALU modifiers
aco/print_ir: simplify using VALU instruction
aco/optimizer: simplify using VALU instruction
aco: remove VOP[123C]P? structs
aco: add bitfield array helper classes
aco: use bitfield array helpers for valu modifiers
aco/assembler/gfx11: simplify 16bit VOP12C promotion to VOP3
aco/optimizer: don’t reallocate instruction when converting to VOP3
aco: don’t reallocate fma{mk,ak,_mix} instruction
aco: copy abs/neg with assignment
aco: use integer access for neg_lo/neg_hi
aco: use array indexing for opsel/opsel_lo/opsel_hi
aco: access neg/abs as int in usesModifiers
aco: use bitfield_array for temporary neg/abs/opsel
nir: optimize i2f(f2i(fsign))
aco: remove duplicates from .clang-format
amd: remove duplicate from .clang-format
aco: don’t check usesModifiers for pseudo instructions
aco: fix p_interp_gfx11 comment
aco: make .clang-format usable with tests
aco/ir: fix copy paste bug in convert_to_SDWA
aco/util: override default assignment operator for bitfield helpers
aco: clean up to_mad_mix
aco/ra: don’t reallocate VOP3 instruction for non-vcc lane mask
aco/vn: hash opsel for VOP12C
aco/assembler: support VOP12C opsel
aco: validate VOP12C opsel
aco/to_hw_instr: use VOP1 opsel for v_mov_b16
aco/ra: prepare for VOP12C opsel
aco/optimizer: preserve opsel when fusing fma
aco: handle opsel in combine_comparison_ordering
aco: handle opsel in combine_ordering_test
aco: handle opsel in combine_constant_comparison_ordering
aco: update match_op3_for_vop3 for VOP12C opsel
aco: support v_cvt_f32_f16 with opsel in combine_mad_mix
aco: support neg(mul)/abs(mul) optimization in more cases
aco: return true in usesModifiers for VOP12C with opsel
aco: swap opsel when swapping VOP2/C operands
aco/ir: copy opsel when converting to DPP
aco: don’t label mul with opsel as abs/neg
aco/gfx11: allow opsel for VOP12C
aco/optimizer: use opsel for VOP12C
aco: keep label_mul/usedef/minmax in apply_extract
aco/optimizer: remove to_SDWA
aco: add tests for fma with opsel
aco: add tests for dpp with opsel
aco: add tests for swap operand with opsel
aco: add tests for cmp ordering with opsel
aco: add test for min/max combining with opsel
aco/tests: run optimize.mad_mix.input_conv.modifiers on gfx11
aco: add tests for neg(mul) with opsel
aco/tests: add missing dependency on generated header
Gert Wollny (49):
glsl/nir: only set uses_sample_shading when the output is a fbfetch
nir: Add possibility to store image var offset in range_base
nir: Add range_base to atomic_counter and an option to use it
ntt: handle the image intrinsic range_base when translating to TGSI
ntt: Make use of the range_base offset when translating atomics in NTT
virgl: lower image variable offsets into the intrinsic range_base value
virgl: Request setting the atomic offset in the range_base
virgl: drop the separable flag for cases that can’t be handled
r600/sfn: Fix readport check
r600/sfn: Do a bit of cleanup with the secondary read port validation
r600/sfn: Fix opcode and result dest slot mask for variable size dot
r600/sfn: Fix splitting of multislot alu ops
virgl: remove unused virgl_encoder_inline_write
r600/sfn: Use range_base for atomics and images
r600/sfn: Work around dependency issue when splitting op to group
r600/sfn: drop useless instr use count
r600/sfn: Fix a typo
r600/sfn: Silence warnings about unused parameters
r600/sfn: Don’t copy propagate indirect loads to more than one dest
r600/sfn: Stop try scheduling in t-slot with empty related v-slot
r600/sfn: rename texture coordinate offset for clarity
r600/sfn: address use in group only if instr can be added
r600/sfn: Forward setting the block ID and index
r600: Don’t start new CF for every fetch through tex clause
r600/sfn: Fix handling of fetch through texture clause
r600/sfn: Fix alu trans op flag setup
r600/sfn: Fix Cayman trans from string and add test for copy prop
vulkan/wsi: Take Xwayland into account for x11_min_image_count
zink/kopper: Add extra swapchain images for Venus
r600/sfn: be more conservative with channel use in multi-slot ops
r600/sfn: Fix readport cycle map
r600/sfn: Fix minimum required registers
r600/sfn: Add AluGroup method to update readport validation from scratch
r600/sfn: Split AluInstr replace_source into test and actual replace
r600/sfn: Add method to AluGroup to replace sources
r600/sfn: Add print method to AluReadportValidation
r600/sfn: redirect copy propagation to alu parent group
r600/sfn/tests: Add a test for the copy prop into a group
r600/sfn: Fix atomic lowering
virgl: Enable AMD_vertex_shader_(layer|viewport_index) when host supports it
virgl: Don’t try to do re-alloc or readback by transfer for blob resources
ntt: add option to lower SSBO bindings to buffer index
virgl: Lower binding start into buffer indices
r600/sfn: fix container allocators
r600/sfn: Lower tess levels to vectors in TCS
r600/sfn: make sure f2u32 is lowered late and correctly for 64 bit floats
r600/sfn: assign window_space_position in shader state
r600/sfn: Add support for image_samples
r600/sfn: fix cube to array lowering for LOD
Giancarlo Devich (25):
d3d12: Use varying comparison function for TESS stage key compare
d3d12: Add unions to encompass shader key stage vars, use in hashing
nir: Check sampler_binding is valid when lowering tex shadow
d3d12: Don’t clear d3d12_shader_key
d3d12: Move d3d12_context_state_table_entry to d3d12_resource_state.h
d3d12: Assign up to 16 simultaneously active contexts unique IDs
d3d12: Track up to 16 active context resource states locally in d3d12_bo
d3d12: Don’t recompute has_flat_varyings or missing_dual_src_outputs
d3d12: Track max varying slot, set and compare less bytes
d3d12: Don’t unnecessarily zero out gs/tcs keys
d3d12: Don’t memcmp gs/tcs keys
d3d12: Create varying structures as necessary, reference them
d3d12: Don’t loop in update_draw_indirect_with_sysvals
d3d12: Compare shader keys with a switch, instead of cascading if’s
d3d12: Compare shader keys with union-encompassing fields all at once
d3d12: Compare shader key common parts with memcmp, instead of if’s
d3d12: Cache varying info to reduce compare/copy cost
d3d12: Use memcmp for full tcs/gs variant keys
d3d12: Track up to 16 contexts worth of pending barriers locally in bos
d3d12: Don’t unnecessarily recompute manual_depth_range
d3d12: Use context-level sampler_state array for filling shader keys
d3d12: Use short circuit in shader key compare; update key hash
d3d12: Reduce gs variant key init cost; unnecessary validate gs calls
d3d12: Unroll shader variant selection loop
d3d12: Track up to 16 contexts worth of batch references locally in bos
Guilherme Gallo (24):
radeonsi/ci: Update stoney test expectations
radeonsi/ci: Skip slow traces on raven
Revert “ci: disable Collabora’s LAVA lab for maintance”
ci/lava: Move LAVA dependencies to pip
ci/lava: Add LavaFarm class to find LAVA farm from runner tag
ci/lava: Fix LAVA logs issues for Collabora jobs
ci: Upload debian-release artifact to S3
ci: Create debian-arm64-release job
ci: Use release builds in perf jobs
ci: Use workflow to make CI aware of performance jobs
ci: Reuse MESA_CI_PERFORMANCE_ENABLED in performance-rules
ci: Handle carriage return characters in LAVA logs
ci: Fix release build use for performance jobs
ci/baremetal: Wrap artifact download curl with xtrace
ci: Improve piglit-traces “no-perf” filter
ci: Fix freedreno-rules-performance
ci: Add piglit traces hidden jobs
ci/freedreno: create a618-traces and perf jobs
ci/zink: Add zink-a618 trace jobs
ci/zink: Add zink-turnip-manual-rules
ci/zink: Add zink-tu-a618-traces-performance job
ci/zink: Fix zink-tu-a618-traces perf job rules
ci/zink: Reduce zink-tu-a618-traces parallelism
Revert “ci: disable Collabora’s LAVA lab for maintance”
Hampus Linander (4):
nir: Add extr_agx opcode
agx: Add extr instruction to AGX backend
agx: Use AGX extr for tex lowering
agx: Optimize lower_resinfo for cube maps
Hans-Kristian Arntzen (13):
radv: Fix invalid 64-bit shift.
radv: Fix missing VK_ACCESS_2_SHADER_SAMPLED_READ_BIT.
radv: Implement VK_ACCESS_2_DESCRIPTOR_BUFFER_READ_BIT_EXT.
wsi/common: Add common implementation of vkReleaseSwapchainImagesEXT.
wsi/x11: Implement EXT_swapchain_maintenance1.
wsi/common: Implement swapchain present fence.
wsi/common: Add comment about DEFERRED_ALLOCATION_BIT_EXT.
wsi/common: Add function to modify present mode.
wsi/wayland: Implement EXT_swapchain_maintenance1.
wsi/display: Implement EXT_swapchain_maintenance1.
wsi/win32: Implement VK_EXT_swapchain_maintenance1.
radv: Expose VK_EXT_swapchain_maintenance1.
wsi/x11: Fix present ID signal when IDLE comes before COMPLETE.
Harri Nieminen (6):
docs/specs: Fix typos
docs/gallium: Fix typos
docs/freedreno: Fix typos
docs/panfrost: Fix typo
docs/svga3d: Fix typo
bin: Fix typos
Helen Koike (12):
ci/debian-android: move pkgconfig paths to the cross file
ci: move patches to patches directory
android: allow system = ‘android’ on cross file
ci/android: move sdk version and ndk to a job variable
ci: compile deqp for android
ci: compile deqp-runner for android
ci: debian-android compile virgl
ci: export artifacts from debian-android
ci/android: add android to the ci
android/ci: fix removal of inexistent file
android/ci: Fix call to adb
android/ci: raise error on script when not related to the tests
Hyunjun Ko (1):
vulkan/runtime: match the spec when taking pipeline subsets.
Iago Toral Quiroga (20):
broadcom/compiler: produce better code for f2f16 with RTZ rounding
v3dv: add paths to handle partial copies of linear images
v3dv: drop unused field from v3dv_cmd_buffer
v3dv: increase BO allocation size when growing CLs
v3dv: ensure we allocate at least the requested space for a CL
v3dv: add a cl_advance_and_end helper
v3dv: ensure at least V3D_CL_MAX_INSTR_SIZE bytes in last CL instruction
v3dv: ensure we apply binning syncs to secondary command buffers
v3dv: fix stencil view aspect selection of depth/stencil image
v3d: support r{g,gba}16f formats for vertex buffers
broadcom/compiler: track pending ldtmu count with each TMU lookup
v3dv: pause occlusion queries during vkCmdClearAttachments
v3dv: fix format swizzle for buffer views
v3dv: drop unused parameter
v3dv: always acquire display device before checking if we can present
vulkan/wsi/display: set pDisplay to NULL on error
v3d,v3dv: stop trying to force 16-bit TMU output for shadow comparisons
broadcom/compiler: fix v3d_qpu_uses_sfu
broadcom/compiler: add a v3d_qpu_instr_is_legacy_sfu helper
broadcom/compiler: fix incorrect check for SFU op
Ian Romanick (60):
ntt: Add support for fcsel_gt and fcsel_ge opcodes
nir/lower_int_to_float: Add support for i32csel opcodes
r300: Enable generation of fcsel_gt and fcsel_ge opcodes
i915: Enable generation of fcsel_gt and fcsel_ge opcodes
gallium/draw: Enable aapoint NIR helpers to generate bool1, bool32, or float32 Booleans
gallium/draw: Enable polygon stipple NIR helpers to generate bool1 or bool32 Booleans
nir/builder: Eliminate nir_f2b helper (and use of nir_f2b32 helper)
nir/builder: Handle f2b conversions specially in nir_type_convert
nir: Eliminate nir_op_f2b
lavapipe: Fix bad array index scale factor in lvp_inline_uniforms pass
lavapipe: Only check NULL pointers in one place in src_only_uses_uniforms
nir/inline_uniforms: Change num_offsets type to uint8_t
nir/inline_uniforms: Pass max_num_bo and max_offset around as parameters
nir/inline_uniforms: Allow possibility of more than one UBO
nir/inline_uniforms: Allow possibility of uni_offsets and num_offsets being NULL
nir/inline_uniforms: Make src_only_uses_uniforms public, change name
nir/inline_uniforms: Make add_inlinable_uniforms public
nir/inline_uniforms: Add inot condition support
nir/tests: Don’t unconditionally log shaders from this one CF test
nir/tests: Refactor creation of loops for loop_analyze test cases
nir/tests: Add tests for “inverted” loops
nir/tests: Add tests for nir_loop_info::induction_vars tracking
nir/loop_analyze: Track induction variables with uniform increments
nir/loop_analyze: Use nir_loop_variable::update_src instead of nir_basic_induction_var::alu
nir/loop_analyze: Use nir_loop_variable::init_src instead of nir_basic_induction_var::def_outside_loop
nir/loop_analyze: Eliminate nir_basic_induction_var
nir/loop_analyze: Track induction variables with uniform initializer
nir/loop_analyze: Simplify some logic in compute_induction_information
nir: ifind_msb_rev can only have int32 sources
intel/compiler: Lower find_lsb in NIR
nir: intel/compiler: Move ifind_msb lowering to NIR
intel/compiler: Tighter src and dest size bounds checking for some opcodes
nir/algebraic: Only lower ufind_msb with 32-bit sources
nir: intel/compiler: Move ufind_msb lowering to NIR
nir/builder: Do not generate 8- or 16-bit find_msb
nir/algebraic: Do not generate 8- or 16-bit find_msb
nir: Restrict ufind_msb and ufind_msb_rev to 32- or 64-bit sources
nir/algebraic: Optimize some ifind_msb to ufind_msb
nir/lower_int64: Optionally lower ufind_msb using uadd_sat
intel/fs: Don’t copy propagate from saturate to sel
nir/algebraic: Undistribute fsat from fmax
intel/fs: Output opt_combine_constants debug to stderr
intel/fs: Refactor part of opt_combine_constants to a separate function
intel/fs: Rework the loop of opt_combine_constants that collects constants
intel/compiler: Remove one overload of backend_instruction::insert_before
intel/compiler: Use NIR_PASS instead of NIR_PASS_V
intel/compiler: Micro optimize inst_is_in_block
intel/fs: Use specialized version of regions_overlap in opt_copy_propagation
intel/compiler: Micro optimize regions_overlap
intel/fs: Linked list micro optimizations in brw_nir_move_interpolation_to_top
intel/fs: Preserve meta data more often in brw_nir_move_interpolation_to_top
intel/fs: White space fixes
nir/tests: Add many loop analysis tests for induction vars updated by shifts
nir/tests: Add more loop analysis tests for induction vars updated by shifts
nir/tests: Add many loop analysis tests for induction variables modified by imul
nir/loop_analyze: Add a function to evaluate an ALU as constant
nir/loop_analyze: Track induction variable basis information
nir/loop_analyze: Change invert_cond instead of changing the condition
nir/loop_analyze: Use try_eval_const_alu and induction variable basis info
nir/tests: Port almost all loop_analyze tests to new macro-based infrastructure
Ikshwaku Chauhan (1):
radeonsi: Fix distortion for yuv422 format for GFX10.
Illia Abernikhin (1):
util: Extend vk_enum_to_str with bitmasks. vk_enum_to_str only generates literals for enums with @type=”enum”, but many enums have @type=”bitmask” and were not taken into account.
Illia Polishchuk (4):
ANV: Add extra memory types for ANV driver instead of a single one
hasvk: Add extra memory types for hasvk driver instead of a single one
nir: Add sha1 hash for nir shaders converted from spir-v
glx: fix indirect initialization crash
Ilya K (1):
intel/vk/grl: don’t install libgrl.a
Isaac Bosompem (1):
tool/pps: Fix 32-bit build issue with format string
Isabella Basso (5):
nir/algebraic: insert patterns inside optimizations list
nir/algebraic: extend mediump patterns
nir/algebraic: extend lowering patterns for conversions on smaller bit sizes
nir/algebraic: make patterns for float conversion lowerings imprecise
nir/algebraic: remove duplicate bool conversion lowerings
Italo Nicola (10):
panfrost: fix off-by-one when exporting format modifiers
panfrost: fix tiny sample_positions BO memory leak
hud: use defines for default scale/rotation/visibility values
hud: add GALLIUM_HUD_OPACITY envvar
panfrost: fix strict-aliasing violations when packing fb ptrs
etnaviv: abort() instead of assert(0) on compiler error
etnaviv: use stderr for compiler error logging
etnaviv: add default clear_buffer and clear_texture APIS
etnaviv: lower (un)pack_{2x16,2x32}_split and extract_{byte,word}
etnaviv: implement nir_op_uclz and lower find_{msb,lsb} to uclz
Iván Briano (7):
anv: uncompressed views of compressed 3d images are now valid
vulkan: track the right value on CmdSetColorWriteMasks
anv: fix testing for dynamic color blend bits
anv: stop tracking color blend state in the pipeline
anv: use the parameter passed to the macro
intel/fs: handle interpolation modes for at_sample and at_offset too
vulkan/wsi/display: do not dereference a NULL pointer
Jakub Kulík (1):
mesa: Fix format transform on big endian platforms
Jan Beich (1):
util/u_process: implement util_get_command_line for BSDs
Janne Grunau (1):
asahi: Fix typo in debug/error message helper macro
Jarred Davies (16):
pvr: Use common queue submit implementation
pvr: Add support for VK_KHR_timeline_semaphore
pvr: Enable threaded submit when supported
pvr: Clear wait syncs after job submission
pvr: Don’t update fragment signal sync when fragment stage is disabled
pvr: Fix segfaults when pDepthStencilAttachment is NULL
pvr: Generate EOT program at runtime
pvr: Generate dummy emit for renders without any emits
pvr: Add support for multiple emits from EOT program
pvr: Select a single aspect format for the texture state of DS image views
pvr: Add initial support for VK_FORMAT_S8_UINT
pvr: Don’t allocate/upload 0 size coeff programs
pvr: Always mark robustBufferAccess as supported
pvr: Rename pvr_xgl_pds.c to pvr_pipeline_pds.c
pvr: Add robustness buffer support
pvr: Mark all normalized formats as supporting with_packed_usc_channel
Jesse Natalie (224):
ci/windows: Download updated WARP 1.0.4 package
dzn/ci: Remove flakes/fails that don’t hit anymore
dzn/ci: Add image test group, which is all passing now
dzn: Fix clear bind flag logic
microsoft/compiler: Lower pack_[u/s]norm_2x16
microsoft/compiler: Implement texture sample count query
microsoft/compiler: Remove arrays when testing for structs in I/O
microsoft/compiler: Always emit float types in the I/O signature for structs
microsoft/compiler: Re-work the logic for adding SV_SampleIndex to force sample-rate
microsoft/compiler: Use nir info.fs.uses_sample_shading to force sample-rate
microsoft/compiler: Set num_components to 4 when updating pos write instructions
spirv2dxil: For removing unused vars, consider the whole I/O var size
spirv2dxil: When removing unused inputs, make sure they’re actually inputs
spirv2dxil: Allow killing position as an undef varying
spirv2dxil: Replace not-provided inputs with zero instead of undef
dzn: Get options13
dzn: Support alpha blend factor
dzn: When changing root signature, dirty descriptors too
dzn: Use R24G8_TYPELESS for 24/8 depth resources
dzn: Support int border colors
dzn: Storage buffer sizes need to be 4-byte-aligned
dzn: Set MultisampleEnable to enable MSAA lines
dzn: Use typeless format for creation of depth-only or stencil-only D24S8
dzn: Define a symbol that was present in older D3D headers
dzn: Support root signature 1.2
dzn: Support unnormalized coordinate samplers
dzn: Always align cached pipeline header size to input element align
dzn: Add a zeroed zsa state when depth or raster is disabled
dzn: Disable depth when the rasterizer is disabled due to no position output
dzn: Fix format support checks for storage/uniform texel buffers
dzn: Remove cmdbuf query ‘wait’ list
microsoft/compiler: Delete incorrect implementation for load_layer_id
microsoft/compiler: Subpass textures are supposed to be arrays
microsoft/compiler: Delete stale TODO comment
microsoft/compiler: Support view instancing
spirv2dxil: Pass runtime conf struct to lower_shader_system_values
spirv2dxil: Implement lowering for multiview
spirv2dxil: Claim multiview support
dzn: Put nir compilation options in a struct
dzn: Handle multiview pipeline creation
dzn: Handle draws and clears for multiview rendering
dzn: Implement multiview queries
dzn: Enable multiview
dzn: Enable independent blending
dzn: Delete an unnecessary assert
dzn: Rework meta blit VS
microsoft/compiler: Add an overload param to unary function helpers
microsoft/compiler: Implement a few basic wave/subgroup intrinsics
microsoft/compiler: Add lowering passes for basic subgroup vars
spirv2dxil: Use 32-bit shared offsets
spirv2dxil: Support basic subgroups
dzn: Support basic subgroups
microsoft/compiler: Fix atomic image umax
microsoft/compiler: Lower device index to zero
spirv2dxil: Support dispatches with base group indices
dzn: Support vkCmdDispatchBase
dzn: Use common physical device list/enumeration helpers
dzn: Respect suspending/resuming flags to omit clears/resolves
dzn: Set dynamic rendering caps
dzn: When rendering to 3D, don’t treat layers as subresources for barriers
dzn: Move patched vertex buffer capability check up a level
dzn: Use SHADER_LOAD to indicate SAMPLED_IMAGE support
dzn: Use image view usage instead of image usage
dzn: Support EXTENDED_USAGE bit
dzn: Use MULTISAMPLE_LOAD support instead of RT/DS support for MSAA
dzn: Descriptor limits are based on binding tier, not heap tier
dzn: A single sampler descriptor set needs to support 1024 samplers
dzn: Don’t expose variable pointers
dzn: Fix independent blend check
dzn: Enable Vulkan 1.1
microsoft/compiler: Don’t emit threadgroup barriers for graphics shaders
microsoft/compiler: Handle i2i1 and u2u1
microsoft/compiler: Handle i1 overloads
microsoft/compiler: Implement more wave/quad ops
microsoft/compiler: Support emitting the SM6.6 wave size tag
spirv2dxil: Lower some wave op properties
spirv2dxil: Support subgroup SPIR-V caps
dzn: Support more subgroup/quad ops
dzn: Implement subgroup size control extension
dzn: Use core feature matching logic instead of rolling our own
microsoft/compiler: Support float controls
dzn: Fix dynamic rendering clear load op for non-multiview
dzn: Handle separate stencil usage
dzn: Cache GPUVA for buffers
dzn: Support float control
dzn: Always do clears with copies on non-graphics queues
dzn: Enhanced barriers fixes/workarounds
dzn: Ensure we don’t mix DSV+simultaneous-access
dzn: Support Vulkan 1.2
dzn: Fix Windows WSI
dzn: Don’t recursively lock the physical device enum mutex
dzn: Report as a software device for non-Windows
CI/windows: Don’t limit deqp-runner to 4 jobs
CI/windows: Apply CI_FDO_CONCURRENT to piglit too
dzn: Consider linked shaders when computing DXIL hash
wsi/win32: Always use non-SRGB formats for DXGI
wsi/win32: Use app-provided timeout instead of arbitrary hardcoded value
CI: Lima farm is offline
dzn, driconf: Add a driconf entry for NMS to claim wide line support
vulkan/wsi: Add a wsi_device param to get_present_modes
vulkan/wsi/win32: Support tearing (immediate) and VSync (FIFO) present modes
wsi/win32: Don’t require buffer blits for software drivers
wsi/win32: We don’t need a window DC for DXGI
clc: Include opencl-c-base.h with LLVM 15 (using builtins)
microsoft/clc: Set features that are used by CL tests
ci/windows: Update LLVM to 15
nir: Add alignment to load_push_constant
nir_lower_fp16_casts: Allow opting out of lowering certain rounding modes
microsoft/compiler: Handle struct consts in DXIL module dumper
microsoft/compiler: Handle frcp for float16/float64
microsoft/compiler: Ensure native_low_precision is set for 16-bit bitcasts/stores
microsoft/compiler: Handle undef-rounding f2f16 as rtz
microsoft/compiler: Move unaligned load/store pass from CL
microsoft/compiler: Pass deref modes to unaligned pass and handle push const
microsoft/compiler: Simplify bitpacking for load/store lowering with nir_extract_bits
microsoft/compiler: Pass an alignment to constant buffer load lowering
microsoft/compiler: Handle 48-bit stores to SSBO/shared
microsoft/compiler: Support raw buffer load/store intrinsics with 16bit alignment
microsoft/compiler: Support lowering SSBO accesses to 16bit vectors
spirv2dxil: Set min UBO/SSBO alignments
spirv2dxil: Lower unaligned loads and stores
spirv2dxil: Move shader model into runtime conf struct
spirv2dxil: Support 16bit types
dzn: Enable get_surface_capabilities2
dzn: Delete unused extensions table
dzn: Get options4
dzn: Enable 16bit types when supported
dzn: Enable KHR_storage_buffer_storage_class
vulkan/wsi: Fix Windows build
radv: Fix returning an expression from a void function
Revert “CI: Disable Windows runners”
nir: Propagate alignment when rematerializing cast derefs
microsoft/compiler: Implement wave reduce/exclusive scan ops that are supported
microsoft/compiler: Add a lowering pass for scan ops that aren’t supported
spirv2dxil: Handle arithmetic subgroup ops
dzn: Claim the arithmetic subgroup bit
ci/windows: Update warp to 1.0.5
microsoft/compiler: Handle writable buffer UAV size queries
d3d12: Report correct texel buffer max size
d3d12: Fix buffer SRV/UAV creation
d3d12: Remove now-unused UAV format from shader info
microsoft/compiler: Fix setting bit 31 in feature flags
microsoft/compiler: Only set typed UAV load feature bit for multi-comp loads
microsoft/compiler: Refactor type -> resource kind helper
microsoft/compiler: Add helpers for getting res_props structs
microsoft/compiler: Split handle annotation into two parts
microsoft/compiler: Handle “bindless” image/tex sources as heap indices
microsoft/compiler: Support descriptor heap indexing for UBO/SSBO
microsoft/compiler: Use store_dest instead of store_dest_value more
microsoft/compiler: Update header docs for binding modes supported by compiler
spirv2dxil: Add a pass to lower deref tex/image and vulkan ubo/ssbo to bindless
spirv2dxil: Only lower readonly images to SRVs when the option is set
spirv2dxil: Support descriptor indexing capabilities
dzn: Remove device pointers from descriptor heaps
dzn: Remove descriptor heap type from descriptor heap wrapper
dzn: Fix a leak in descriptor set layout creation
dzn: Add some docs around descriptor sets and remove redundant/unused data
dzn: Put UAVs first for storage images/buffers in descriptor tables
dzn: Consistently order depth formats before stencil
dzn: Don’t use plane slice 1 for depth+stencil SRVs
dzn: Set up SRV descs for 3D textures correctly
dzn: Skip setting up UAVs for depth resources
dzn: Add initial bindless infrastructure
dzn: When bindless, only allocate one descriptor per layout entry
dzn: Remove defragmenting of descriptor pools
dzn: Delete unused function
dzn: Allocate descriptor sets in buffers for bindless mode
dzn: Don’t dirty bindings if root signature doesn’t change
dzn: Use separate dirty bits for descriptor sets/dynamic buffers
dzn: Bind buffers for bindless descriptor sets
dzn: Add a binding classification in the pipeline layout remapping
dzn: When binding a bindless root signature, bind descriptor heaps first
dzn: Ensure root signatures are re-bound after a meta op
dzn: Only bind descriptor sets up to the used amount of the current layout
dzn: Apply bindless lowering when compiling pipelines
dzn: Add a debug option for enabling bindless mode
dzn: Support descriptor indexing via bindless
dzn: Enable variable size bindings
dzn: Use mesa_loge for DXIL validation errors
microsoft/clc: Add shader model / validator to compiler API
d3d12: Move forward-front-face pass to common DXIL code
spirv2dxil: Expose yz flip pass to external callers
dzn: Add a helper to generate triangle->point GS
dzn: Handle polygon point mode
dzn: Claim fillModeNonSolid
CI/windows: Update headers and Agility redist to 1.710.0-preview
CI/windows: Increase timeout for build container job
microsoft/compiler: Fix 8-bit loads and stores when supporting 16-bit DXIL
microsoft/compiler: Fix barrier for wave ID computation
microsoft/compiler: Assign 1D wave IDs based on local thread ID
microsoft/compiler: Fix large shifts
spirv2dxil: Add some more supported caps
dzn: Add a driconf entry for enabling 8bit loads and stores
dzn: Add a driconf option for enabling subgroup ops in VS/GS
dzn: Fix SRV barrier state on compute command lists
dzn: Raise max number of descriptor sets to 8
dzn: Report some more caps correctly that are supported
dzn: Align descriptor sets in the bindless buffer
dzn: Ensure pipeline variants are used for dynamic stencil masks
dzn: Don’t use write-combine memory for cache-coherent UMA
dzn: Ensure buffer offsets are aligned
dzn: Attempt to force depth write states for depth access in LAYOUT_GENERIC
dzn: Don’t do initial-layout barriers for simultaneous-access resources
dzn: Batch command lists together
dzn: Fix bindless descriptor sets with multiple dynamic buffers that need custom descriptors
dzn: Early-out on no-op barriers
dzn: Clean up ABI helpers now that we require DirectX-Headers 606
dzn: Use GetResourceAllocationInfo3 for castable formats
dzn: Don’t leave deleted physical devices in the instance pdev list
dzn: Remove skips now that WARP is faster
dzn: Support >2K samplers with bindless
dzn: Remove xfail for test that passes (if run)
microsoft/compiler: Don’t split loads/stores that will be split by lower_explicit_io
dzn: Changes to descriptor set dirty flag handling
dzn: Use a linear allocator for upload data on command buffers
dzn: Ignore unnormalized sampling flag if driver doesn’t support it
dzn: Never set STATE_RENDER_TARGET on a compute command list
dzn: Don’t enable bindless by default
d3d12: Support creating PSOs with no attachments with MSAA without TIR
d3d12: Fix buffer reference leak for SO count staging buffer
dzn: Handle mismatches in bound descriptor set vs pipeline layout
d3d12: Respect buffer offsets for sampler views
dzn: Hook up subgroup size to compute shader compilation
dzn: Delete queue-level event waits
Jonathan Gray (3):
egl/dri2: avoid undefined unlocks
intel/dev: remove invalid EHL pci id
intel/dev: Add another EHL pci id
Jonathan Marek (1):
turnip: fix use of align() instead of util_align_npot() with tile_align_w
Jordan Justen (8):
intel/vk/grl: Don’t include anv_private.h in genX_grl.h
intel/vk/grl: Allow genX_grl.h to be included by C++ files
intel/vk/grl: Allow grl/grl_cl_kernel.h to be included by C++ files
intel/vk/grl: genX-ify genX_grl_uuid.cpp
intel/vk/grl: genX-ify grl_cl_kernel_name()
intel/dev: Enable MTL PCI ids
intel/compiler: Support fmul_fsign opt for fp64 when int64 isn’t supported
intel/compiler/gfx12.5+: Lower 64-bit cluster_broadcast with 32-bit ops
Joshua Peisach (1):
gallium/asahi: fix memory leak in agx_resource_from_handle
José Fonseca (4):
llvmpipe: Ensure floating point SSE state is reset regardless of the write mask.
llvmpipe: Honor zero sample_mask when multisample is disabled.
trace: Don’t use italic escape code.
wgl: Fix unintentional assignment on assert.
José Roberto de Souza (107):
anv: Start to move i915 specific code from anv_device to i915/anv_device
anv: Export anv_exec_batch_debug() and chain_command_buffers()
anv: Split i915 code from anv_batch_chain.c
anv: Move anv_device_check_status() code to i915/anv_device.c
intel/dev: Export functions that will be used by different kernel drivers
intel/dev: Move i915 code to i915/intel_device_info.c
intel/dev: Split hwconfig i915 specific code
intel/dev: Detect what is the kernel mode driver loaded
intel: Add intel_kmd_type parameter to intel_engine_get_info()
intel: Add kmd_type parameter to necessary intel_gem.h functions
anv: Nuke anv_queue:index_in_family
hasvk: Nuke anv_queue:index_in_family
intel/ds: Nuke intel_ds_queue::queue_id
intel/ds: Fix crash when allocating more intel_ds_queues than u_vector was initialized
intel/genxml/gen125: Add walker configuration fields to 3DSTATE_WM
intel/genxml/gen125: Tune 3DSTATE_WM Walker direction
intel: Add intel_memory_class_instance
anv: Convert drm_i915_gem_memory_class_instance to intel_memory_class_instance
anv: Use DRM_IOCTL_I915_GEM_CREATE_EXT in all supported kernels
anv: Add basic KMD backend infrastructure
anv: Start to move anv_gem_stubs.c to kmd backend
anv: Remove remaining bits of anv_i915_query()
hasvk: Remove remaining bits of anv_i915_query()
anv: Add gem_close to kmd backend
anv: Add gem_mmap to kmd backend
anv: Move execute_simple_batch() and queue_exec_locked() to kmd backend
intel/common: Move i915 files to i915 folder
iris: Export batch debug functions
iris: Export update_batch_syncobjs()
iris: Export num_fences()
intel: Make gen12 URB space reservation dependent on compute engine presence
intel/blorp: Allocate only necessary amount of VERTEX_BUFFER_STATE
intel: Pull in xe_drm.h
intel: Add Meson parameter to enable Xe KMD support
intel/dev: Add INTEL_KMD_TYPE_XE
intel/dev: Implement Xe functions to fill intel_device_info
intel/dev: Implement Xe functions to handle hwconfig
intel/dev: Query and compute hardware topology for Xe
iris: Convert drm_i915_gem_memory_class_instance to intel_memory_class_instance
iris/bufmgr: Add i915_gem_set_domain()
iris: Use DRM_IOCTL_I915_GEM_CREATE_EXT in all supported kernels
iris: Add initial skeleton of kmd backend
iris: Move iris_bo_madvise() to i915/iris_bufmgr.c
iris: Add iris_bo_set_caching()
intel/common: Implement the Xe functions for intel_engine
intel/common: Implement the Xe functions for intel_gem
iris: Move bo_madvise to kmd backend
iris: Move bo_set_caching to kmd backend
iris: Move iris_bo_busy_gem() to i915/iris_bufmgr.c
iris: Move iris_bo_wait_gem() to i915/iris_bufmgr.c
iris: Don’t mark protected bo as reusable
intel/perf: Disable it for Xe KMD
build: Block build of HASVK, Crocus and i915 in non-x86 architectures
iris: Add gem_mmap() to kmd backend
iris: Add batch_check_for_reset() to kmd backend
iris: Move i915 submit_batch() to i915 backend
anv: Implement gem_create for Xe backend
anv: Implement Xe functions to create and destroy VM
anv: Implement gem close and mmap for Xe backend
anv: Add gem VM bind and unbind to backend
anv: Integrate gem vm bind and unbind kmd backend functions
iris: Drop I915_EXEC_FENCE types
iris: Drop usage of i915 EXEC_OBJECT_WRITE
iris: Move iris_bufmgr_init_global_vm() to i915/iris_bufmgr.c and prepare for Xe KMD
anv: Implement Xe version of anv_physical_device_get_parameters()
anv: Properly alloc buffers that will be promoted to framebuffer in Xe KMD
anv: Handle external objects allocation in Xe
iris: Only mark buffer as exported if drmPrimeHandleToFD() succeed
iris: Implement the Xe version of iris_bufmgr_init_global_vm()
iris: Implement the function to destroy VM in Xe
iris: Implement gem_create() in Xe kmd backend
iris: Implement gem_mmap() in Xe kmd backend
iris: Store iris_context’s priority
iris: Move to i915/iris_batch.c code to create and replace i915 context
iris: Move to iris_i915_batch.c code to destroy i915 context
intel: Move memory alignment information to intel_device_info
anv: Use intel_device_info memory alignment
intel: Set mem_alignment in Xe kmd
anv: Apply memory alignment requirements in Xe kmd
intel: Add TODO about removal of 2Mb alignment in i915
anv: Replace I915_ENGINE_CLASS_VIDEO by INTEL_ENGINE_CLASS_VIDEO
anv: Create Xe engines
anv: Implement Xe version of check_status()
anv: Handle Xe queue/engine priority
anv: Implement Xe version of execute_simple_batch()
iris: Prepare iris_bufmgr functions for vm bind error paths
iris: Add vm bind and unbind to kmd backend
iris: Implement gem_vm_bind() and gem_vm_unbind() in Xe kmd backend
iris: Adjust gem buffer allocation size in Xe kmd
intel: Sync xe_drm.h
anv: Partially import drm-uapi/gpu_scheduler.h and use it
anv: Fetch max_context_priority from drm_xe_query_config
intel: Allocate mesh shader URB space before task shader
anv: Move to a function code to clflush batch buffers
anv: Implement Xe version of anv_queue_exec_locked() and queue_exec_trace()
anv: Disable anv_bo_sync_type for Xe kmd
anv: Add assert in functions not supported by Xe kmd
iris: Add BO_ALLOC_SHARED
iris: Handle allocation of exported buffers in Xe kmd
iris: Handle allocation of scanout buffers in Xe
iris: Implement Xe version of bo_madvise() and bo_set_caching()
anv: Fix vm bind of imported buffers
iris: Add function to close gem bos
iris: Handle Xe synchronization with syncobjs
loader: Add Xe KMD support
iris: Fix close of exported bos
iris: Allow shared scanout buffer to be placed in smem as well
Juan A. Suarez Romero (8):
v3d/v3dv: define performance counters in common
v3d: cache pipe query results
v3d: include offset as part of streamout target
v3d: implement NV_conditional_render extension
v3d: fix condition for EZ disabling when stencil on
v3d: set depth compare function correctly
v3d: use primitive type to get stream output offset
v3d: apply 1D texture miplevel alignment in arrays
Julia Tatz (3):
zink: zink_heap isn’t 1-to-1 with memoryTypeIndex
zink: trivial renames heap_idx -> memoryTypeIndex
zink: correct sparse bo mem_type_idx placement
Juston Li (29):
venus: refactor out vn_feedback_event_cmd_record2
venus: refactor VK_KHR_synchronization2 ext sync fd requirements
venus: require importable external semaphores for WSI
venus: require exportable bit for ext fence sync fd
venus: require exportable/importable bit for ext semaphores sync fd
venus: remove filtering external semaphores for QueueSubmit
venus: drop VkQueueBindSparse
venus: append fence feedback batch
venus: refactor QueueSubmit/QueueSubmit2
venus: vn_queue: align vulkan object variable naming
docs/envvars: add missing mesa disk cache envvars
util/fossilize_db: don’t destroy foz on RO load fail
util/fossilize_db: refactor out loading RO foz dbs
util/fossilize_db: add runtime RO foz db loading via FOZ_DBS_DYNAMIC_LIST
util/fossilize_db: fix macOS inotify build error
util/fossilize_db: add ifdef for inotify header
util/tests/cache_test: Skip Cache.List if not supported
anv: check initial cmd_buffer is chainable
venus: refactor semaphore helper functions
venus: refactor batch submission fixup
venus: add NO_TIMELINE_SEM_FEEDBACK perf option
venus: add timeline semaphore feedback cmds
venus: enable timeline semaphore feedback
venus: add SHADER_DEVICE_ADDRESS_BIT to buffer cache
venus: switch to lazy VkBuffer cache
venus: add VN_DEBUG_CACHE flag
venus: Add VkBuffer cache statistics for debug
venus: shader cache fossilize replay fix
util/disk_cache: use posix_fallocate() for index files
Kai Wasserbäch (5):
fix: gallivm: limit usage of LLVMContextSetOpaquePointers() to LLVM 15
fix(FTBFS): gallivm: fix LLVM #include of Triple.h, moved to TargetParser
fix(FTBFS): clover: fix LLVM #include of Triple.h, moved to TargetParser
fix: clover/llvm: replace llvm::None with std::nullopt for LLVM 17+
fix: gallivm: fix LLVM #include of Host.h, moved to TargetParser
Kai-Heng Feng (2):
iris: Retry DRM_IOCTL_I915_GEM_EXECBUFFER2 on ENOMEM
Revert “iris: Avoid abort() if kernel can’t allocate memory”
Karmjit Mahil (28):
pvr: Process set and reset event sub commands.
pvr: Process wait event sub command.
pvr: Add SPM scratch buffer infrastructure.
pvr: Acquire scratch buffer on framebuffer creation.
pvr: Update comment about ZS and MSAA buffers for pvrsrvkm submission.
pvr: Set SPMSCRATCHBUFFER flag.
pvr: Add SPM load usc empty programs
pvr: Upload spm load programs to device.
pvr: Add support for VK_ATTACHMENT_LOAD_OP_LOAD.
pvr: Move descriptor write into pvr_write_descriptor_set()
pvr: Add support to copy descriptors on vkUpdateDescriptorSets()
pvr: Handle VK_QUERY_RESULT_WAIT_BIT.
pvr: Store enum pvr_stage_allocation instead of VkShaderStageFlags
pvr: Put old descriptor set approach behind a hardcoding check
pvr: Change last_DMA to last_dma
pvr: Write descriptor set addrs table dev addr into shareds
pvr: Add PVR_SELECT() helper macro
pvr: Add push consts support to descriptor program.
pvr: Add support for dynamic buffers descriptors
pvr: Add support for blend constants.
pvr: Move PBE START_POS into csb enum helpers header
pvr: Setup SPM EOT state
pvr: Remove unused msaa_mode field
pvr: Remove component_alignment
pvr: Setup SPM background object
pvr: Don’t advertise currently unsupported features
pvr: Advertise STORAGE_IMAGE_BIT for B10G11R11_UFLOAT_PACK32
pvr: Don’t advertise S8_UINT support
Karol Herbst (44):
rusticl: fix build error with valgrind being enabled
rusticl/util: extract offset_of macro
rusticl/icd: Make it work in case Rustc shuffles struct around
rusticl/kernel: fix clGetKernelInfo CL_KERNEL_ATTRIBUTES for non source programs
rusticl/program: enable spirv
llvmpipe/ci: increase deqp-runner timeout
rusticl/device: fix some device limits
rusticl/device: limit CL_DEVICE_MAX_CONSTANT_ARGS
rusticl: no compute only
rusticl: allocate printf buffer as staging
nir: Skip samplers and textures in lower_explicit_io
nir/deref: don’t replace casts with deref_struct if we’d lose the stride
ci/zink: move threading tests to flakes
rusticl/kernel: Images arg sizes also have to match the host pointer size
gallivm: fix lp_vec_add_offset_ptr for 32 bit builds
nvc0: enable fp helper invocation memory loads on Turing+
nir: track existence of variable shared memory
rusticl/kernel: set has_variable_shared_mem on the nir
gallium: add get_compute_state_info
lp: implement get_compute_state_info
iris: implement get_compute_state_info
nv50: implement get_compute_state_info
nvc0: implement get_compute_state_info
panfrost: move max_thread_count and take reg_count into account
panfrost: implement get_compute_state_info
rusticl/kernel: make use of cso info
radeonsi: implement get_compute_state_info
radeonsi: use default float mode for CL
rusticl: enable radeonsi
rusticl: split platform into core and api parts
rusticl/platform: rename _cl_platform_id to Platform
rusticl/platform: move getter into the type
rusticl/platform: move device initialization to the platform
rusticl/program: allow dumping compilation logs through RUSTICL_DEBUG
rusticl/program: make IL programs look closer to CLC ones
clc: add clc_validate_spirv
rusticl/program: validate the SPIR-V when created from IL
rusticl/program: extract common code of compile and build
rusticl/program: rework source code tracking
rusticl/event: drop work item before updating status
radeonsi: lower mul_high
ac/llvm: support shifts on 16 bit vec2
rusticl: don’t set size_t-is-usize for >=bindgen-0.65
nvc0: do not randomly emit fences.
Kenneth Graunke (37):
intel/blorp: Lower base_workgroup_id to zero
intel/compiler: Move atomic op translation into emit_*_atomic()
intel/compiler: Use LSC opcode enum rather than legacy BRW_AOPs
intel/compiler: Add an lsc_op_num_data_values() helper
intel/compiler: Eliminate SHADER_OPCODE_UNTYPED_ATOMIC_FLOAT
intel/compiler: Drop redundant 32-bit expansion for shared float atomics
intel/compiler: Delete fs_visitor::nir_emit_{ssbo,shared}_atomic_float()
intel/compiler: Combine nir_emit_{ssbo,shared}_atomic into one helper
intel/compiler: Delete all the A64 atomic variants for type sizes
intel/compiler: Drop dest checking in atomic code
intel/compiler: Use more symbolic source names in components_read()
anv: Add missing untyped data port flush on PIPELINE_SELECT
iris: Add missing untyped data port flush on PIPELINE_SELECT
loader: Add infrastructure for tracking active CRTC resources
egl: Rewrite eglGetMscRateANGLE to avoid probes and handle multi-monitor
iris: Perform load_constant address math in 32-bit rather than 64-bit
anv: Perform load_constant address math in 32-bit rather than 64-bit
anv: Make a batch decoder for each queue family
nir: Print divergence information for registers as well as SSA defs
nir: Fix merge_set_dump() to compile again
nir: Fix typos in the from-SSA pass comments
intel: Use common helpers for TCS passthrough shaders
intel/fs: Fix inferred_sync_pipe for F16TO32 opcodes
intel/fs: Add builder helpers for F32TO16/F16TO32 that work on Gfx7.x
intel/fs: Delete a TODO about using brw_F32TO16.
intel/fs: Use new F16TO32 helpers for unpack_half_split_* opcodes
Revert “intel/fs: Fix inferred_sync_pipe for F16TO32 opcodes”
intel/fs: Use F32TO16/F16TO32 helpers in fquantize16 handling
intel/fs: Move packHalf2x16 handling to lower_pack()
intel/eu: Simplify brw_F32TO16 and brw_F16TO32
intel/vec4: Retype texture/sampler indexes to UD
intel/fs: Make bld.F16TO32 actually emit F16TO32 not F32TO16
i965/vec4: Implement uclz in the vec4 backend
st/mesa, iris: Add optional CPU-based ASTC void extent denorm flushing
intel/compiler: Use nir_dest_bit_size() for ballot bit size check
iris: Extend resource creation helpers to allow for explicit strides
iris: Hack around gbm_gralloc stride restrictions
Konrad Dybcio (3):
freedreno/registers: Add RBBM_GPR0_CNTL for non-GMU operation
freedreno: Add A2xx perf counter reg values
freedreno: Add A2xx REG_A2XX_RBBM_PM_OVERRIDE2 bitfields
Konstantin Seurer (78):
vulkan: Track the nullDescriptor feature
radv: Add a helper for finding memory indices
radv: Create a null TLAS as meta state
radv: Use the null accel struct instead of emitting 0
radv/rt: Get rid of accel struct null checks
radv: Advertise rt pipelines for Control (DX12)
radv/bvh/meson: Add the option to set defines
radv/bvh: Add a define for extended SAH
radv: Add a shader variant for PLOC with extended SAH
radv: Wrap internal build type inside a build_config struct
radv: Enable extended SAH for shallow BVHs
radv: Merge the leaf and internal converter
radv: Improve the BVH size estimation
radv: Fix creating accel structs with unbound buffers
radv: Work around shader_call_data variables in raygen shaders
radv/rq: Use 16 stack entries if there is only one ray query
radv/llvm: Use the shader names as module name
ac/llvm: Fix validation error with global io
radv: Scalarize global IO with LLVM enabled
radv: Make radv_compute_dispatch non-static
radv: Implement ordered compute dispatches
radv: Use an ordered dispatch for BVH encoding
radv: Remove radv_indirect_unaligned_dispatch
vulkan,nir: Refactor ycbcr conversion state into a struct
radv: Use common ycbcr conversion lowering
radv/rra: Find copy memory index when initializing the trace state
radv/rra: Hide deferred accel struct data destruction behind an env var
radv: Hash VK_PIPELINE_CREATE_RAY_TRACING_NO_NULL_* flags
radv: Clean up dynamic RT stack allocation
ac/llvm: Implement bvh64_intersect_ray_amd
radv: Make accel struct meta state initialization thread safe
radv: Force ACO for BVH build shaders
radv: Pre-compile BVH build shaders if there is a cache
radv: Advertise ray query support with LLVM
radv/rt: Skip instances after loading the entire node
vulkan: Add vk_acceleration_structure
radv: Use vk_acceleration_structure
anv: Use vk_acceleration_structure
radv/bvh/encode: Use the node type for identifying internal nodes
radv/bvh: Replace is_final_tree with bvh_offset
radv/bvh/encode: Move bvh_offset NULL check to the top of the loop
radv/bvh/encode: Introduce is_root_node
radv/bvh/encoder: Move dst_node initialization into the loop
radv: Add a build config for compact builds
radv/bvh: Implement compact encoding
radv: Use compact encoding
radv: Move the geometry infos before the BVH
radv/bvh: Move the size header field up
radv/bvh: Add a shader for filling the header
radv: Use indirect header filling for compact builds
nir: Add cull_mask_and_flags_amd intrinsic
radv/rt: Merge cull_mask and flags
radv/rt: Pre shift cull_mask
radv: Move header and geometry info init into separate functions
radv: Only init geometry infos if RRA is enabled
radv/rt: Use ushr for extracting the cull mask
radv/rt: Fix updating stack_size if the shader uses scratch
radv/rt: Use vk_pipeline_hash_shader_stage for RT stages
vulkan: Add vk_shader_module_init
radv/rt: Properly handle pNext of pipeline library stages
radv/sqtt: Skip dumping pipeline libraries
radv: Fix loading stack_size from the cache
radv: Fix inserting stack_size into the cache
radv/rt: Handle load_constant instructions when inlining shaders
nir/lower_shader_calls: Remat derefs before lowering resumes
radv/rt: Refactor rq_load lowering
radv/rq: Rematerialize inv_dir before proceed
radv: Set user SGPR locations when declaring args
radv: Stop counting user SGPRS separately
radv/ci: Update ray tracing pipeline fail/skip lists
radv: Add radv_shader_type to fix gs_copy and trap handler handling
radv: Remove some dead radv_shader_args setup
aco: Remove is_gs_copy_shader
radv: Remove has_previous_stage
radv: Pack and encode geometry id and flags on the CPU
radv/bvh: Remove calculate_node_bounds
radv: Remove radv_bvh_aabb_node::aabb
nir/lower_fp16_casts: Fix SSA dominance
Kurt Kartaltepe (1):
drirc: Set limit_trig_input_range option for Nier games
Lang Yu (1):
amd/common: fix a typo
Lepton Wu (1):
egl/dri2: Use primary device in EGL device platform for kms_swrast
Lina Versace (8):
util/glsl2spirv: Fix build with Python 3.6
docs: Lower Python requirement to 3.6
mailmap: Lina is Chad’s new name
mailmap: Add Lina’s new google.com address
venus: Update protocol for VK_EXT_memory_budget
venus: Delete vn_renderer_info::has_cache_management
venus: Refactor vn_physical_device_init_memory_properties
venus: Enable VK_EXT_memory_budget
Lionel Landwerlin (151):
pps: print out message when we get the first counters
anv: record secondaries’ traces into primaries
intel/ds: track secondary cmdbuffers in perfetto
intel/ds: move event_id access to perfetto lambda
util/u_trace: add support for variable length trace points
vulkan/debug_utils: copy debug util labels
anv: add support for command buffer tagging in traces
intel/ds: add INTEL_GPU_TRACEPOINT envvar to toggle tracepoints
intel/ds: remove unused trace point
intel/utrace: document tracepoints
Revert “ci: build hasvk if we’re building anv”
intel: use a shared UUID with other drivers
nir/divergence: add missing RT intrinsic handling
anv: fix generated indirect draw shader stats checks
nir/lower_io: fix bounds checking for 64bit_bounded_global
anv: fix preemption enable emission in gpu_memcpy
intel/fs: avoid cmod optimization on instruction with different write_mask
intel/decoder: print out compute push constants
intel/common: add a INTEL_DECODE variable to parameter decoder at runtime
vulkan/wsi/wayland: improve same gpu detection
intel/fs: drop FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GFX7
anv: fix null descriptors
docs: list anv in EXT_extended_dynamic_state3 support
intel: add missing PS restriction on BDW+
anv: expose EXT_load_store_op_none
intel/fs: make alpha_to_coverage a tristate
anv: rename RT pipeline function helper
anv: rename a few internal functions to highlight gfx use
intel/ds: track end of pipe bits
anv: use malloc for host only descriptor sets
anv: put the video extension behind a variable
intel/fs: fix mesh indirect movs
intel/dev: add a default urb value for intel_stub_gpu on dg2
anv/hasvk: handle a SAMPLED_READ/STORAGE_READ access flags
anv: remove copied information from runtime graphics state
anv: move 3DSTATE_VERTEX_ELEMENT emission to dynamic path
anv: implement VK_EXT_vertex_input_dynamic_state
intel/fs: bound subgroup invocation read to dispatch size
hasvk: fix KHR_shader_float_controls reporting
intel/perf: also add the oa timestamp shift on MTL
anv: move debug submit to helper and call it on execbuf failure
anv: track vram only BOs to print things out on ENOMEM execbuf
anv: fix vma heap memory leak
anv: fix invalid masking of 48bit address
anv: remove assert typed write support when using NULL surface
vulkan/runtime: store parameters of VK_EXT_sliced_view_of_3d
anv: fixup Wa_16011107343 for Gfx12 only
iris: fix Wa_16011107343 for Gfx12
anv: remove more Gfx7 code
genxml: Fix STATE_BASE_ADDRESS::BindlessSurfaceStateSize field size
genxml: fix border color offset field on Gfx12+
anv/hasvk: speed up null image/view descriptor writes
anv: fix scratch buffer reloc in 3DSTATE_HS
anv: fixup condition for Wa_14016118574
anv: pull Wa_14016118574 out of some loop not changing state
util/glsl2spirv: add support for include directive
anv: fix incorrect parameter
anv: correctly reset generation address on command buffer reset
anv: fix generated forward jump with more than 67M draws
anv: remove copied code from generation shader
anv: remove BTI related flush in generation shaders
anv: correctly program 3DSTATE_SF in generation shaders
anv: limit push constant dirtyness with generation shaders
anv: remove pre hasvk split assert
anv: remove commented code
anv: fix 3DSTATE_PS emission in generation shaders
anv: fix indirect draws VF cache tracking of index buffer
anv: make sure mi_memcpy lands before push constant loads
anv: remove MI_NOOPs at the end of the generation batch
anv: use a single generation shader for indirect draws
anv: rename generated draws for Gfx11
anv: use 64bit int support in generation shaders
anv: pack more data into generated draws input
anv: move common shader code into header
anv: use a list of generated shaders
anv: remove unused item_count parameter
anv: add gfx9 generated draw support
blorp: add dependency on idep_intel_dev
vulkan/runtime: only consider slice info with 3D image views
anv: VK_EXT_image_sliced_view_of_3d
nir: fix nir_ishl_imm
anv: enable VK_EXT_pipeline_library_group_handles
anv/iris: report counter symbols with debug option
intel/fs: report max register pressure in shader stats
anv: report max register pressure in pipeline properties
anv: force MEDIA_INTERFACE_DESCRIPTOR_LOAD reemit after 3D->GPGPU switch
radv: use 1ull for alignment computations
util: allow align64() to do alignments >= 4Gb
docs: fix invalid link
iris: trace frames with u_trace
anv: export EXT_pipeline_library_group_handles only with RT
docs: update Anv features support
anv: more formats for acceleration structure vertices
intel/fs: don’t SEND messages as partial writes
intel/fs: fix nir_opt_peephole_ffma max vec assumption
intel/fs: fixup sources number from opt_algebraic
intel/fs: add MOV source count validation
intel/fs: prevent large vector ops generated by peephole_ffma
intel/fs: fix subgroup invocation read bounds checking
vulkan/wsi: add a headless swapchain implementation/option
intel/compiler: report max dispatch width statistic
anv: report shader max dispatch width in pipeline props
intel/devinfo: add an option to pick platform to print
intel/devinfo: printout URB entries
intel/dev: use generated WA helpers for Wa_22012575642
intel/devinfo: dedicated entries for XeHP
intel/devinfo: initialize pci_device_id with from_pci_id()
intel/dev: fold Gfx12 URB entries in Gfx12 HW info
util/u_trace: move needs_cs_param option to tracepoints
vulkan/runtime: also copy strings on queue debug utils
intel/ds: rename frame timeline row to queue
anv: fix incorrect utrace bo release
anv: fixup locking for utrace submission increments
anv: rename anv_utrace_flush_copy in anv_utrace_submit
anv: add utrace support for queue debug utils
anv: implement recommended flush/wait of AUX-TT invalidation
iris: implement recommended flush/wait of AUX-TT invalidation
anv: hash immutable sampler conversion data not pointers
anv: compute the largest GRL kernel scratch size
anv: move queue check helpers to anv_private
anv: take care of maxStorageBufferRange being uint32_t
isl: update max buffer size for SKL+
intel/dev: set a default valid kmd_type
intel/perf: fix OA format selection on MTL
intel/fs: run VGRF compaction just before max live register accounting
intel/fs: don’t consider fixup_nomask_control_flow SENDs predicate
intel/fs: UNDEF fixup_nomask_control_flow temp register
intel/fs: copy instruction sources in logical send lowering
intel/fs: factor out lsc surface descriptor settings
nir: reuse nir_component_mask() where it makes sense
nir: add 2 new intel intrinsics for uniform ssbo/shared loads
intel/fs: optimize uniform SSBO & shared loads
intel/fs: also allow vec8+ vectorization of load_global_const_block_intel
anv: pass stream output as argument for anv_dump_pipe_bits
anv: replace query flush before gpu copy by semaphore wait
anv: fixup streamout write barriers
intel/fs: use nomask for setting cr0 for float controls
anv: exclude performance queries from blorp clears
intel/ds: add a new timeline row for frames
anv: add utrace tracking of frame boundaries
vulkan/runtime: discard unused graphics stages in libraries
intel/vec4: force exec_all on float control instruction
vulkan/overlay: deal with unknown pNext structures
isl: don’t set inconsistent fields for depth when using stencil only
isl: fix a number of errors on storage format support on Gfx9/12.5
anv: rework Wa_14017076903 to only apply with occlusion queries
nir/divergence: add missing load_global_constant_* intrinsics
anv: fix anv_nir_lower_ubo_loads pass
intel/fs: fix per vertex input clamping
intel/compiler: make uses_pos_offset a tri-state
intel/fs: fix scheduling of HALT instructions
Liviu Prodea (1):
meson: Ignore unused variables when assertions are disabled
Lone_Wolf (3):
compiler/clc: Fix embedded clang headers (microsoft-clc) for LLVM 16+
clc: Add clangASTMatchers to fix static llvm build of microsoft-clc with LLVM 16+
clc: Add clang frontendhlsl module to fix build of microsoft-clc with llvm 16+
Luc Ma (2):
xlib: fix glXDestroyContext in Gallium frontends
meson: keep Mako version checking in accord with build msg
Lucas Fryzek (11):
crocus: Add support for `get_screen_fd`
tegra: Add support for `get_screen_fd`
nouveau: Add support for `get_screen_fd`
zink: Add support for `get_screen_fd`
iris: Add support for `get_screen_fd`
i915: Add support for `get_screen_fd`
svga: Add support for `get_screen_fd`
virgl: Add support for `get_screen_fd`
r300/r600/radeon_si: Add support for `get_screen_fd`
d3d12/llvmpipe/softpipe: Add support for `get_screen_fd`
gallium: Modify default path for DMABUF to use DRM
Lucas Stach (7):
etnaviv: don’t drop TS capability on GPUs with MMUv2
etnaviv: drm: fix BO array leaks
etnaviv: free pm queries dynarray on screen destroy
etnaviv: drm: fix check if BO is on a deferred destroy list
etnaviv: fix double scanout import of multiplanar resources
etnaviv: flush VS texture cache when texture data is changed
etnaviv: fix texture barrier implementation
Luigi Santivetti (4):
pvr: fix uses_tile_buffers in clear color attachment
pvr: add support for tile buffer output clear
pvr: add padding bytes when allocating buffer memory
pvr: fix segfault in dEQP-VK.ycbcr.query.*
Luna Nova (3):
device_select_layer: fix inverted strcmp in device_select_find_dri_prime_tag_default (v1)
device_select_layer: apply DRI_PRIME even if default device is > 1 to match opengl behavior
device_select_layer: pick a default device before applying DRI_PRIME
Lynne (1):
aco_validate: allow for wave32 in p_dual_src_export_gfx11
M Henning (1):
nouveau/codegen: Check nir_dest_num_components
Maarten Lankhorst (1):
iris: Place scanout buffers only into lmem for discrete GPUs
Marcin Ślusarz (23):
intel/compiler: fix generation of vec8/vec16 alu instruction
intel/compiler/mesh: handle const data in task & mesh programs
intel/compiler: fine-grained control of dispatch widths
nir: add nir_mod_analysis & its tests
intel/compiler/mesh: optimize indirect writes
intel/compiler/mesh: support longer write messages
intel/compiler/mesh: remove dead code path supporting >4 dword writes
intel/compiler/mesh: use U888X packed index format
anv: bump ANV_MAX_QUEUE_FAMILIES
intel/compiler: replace gl_Layer & gl_ViewportIndex by 0 in fs if ms doesn’t write it
anv: fix how unset gl_Viewport & gl_Layer are handled in mesh case
intel/compiler/mesh: use slice id of task urb handles in mesh shaders
anv: enable task redistribution
intel/compiler/mesh: apply URB payload mask once per program
intel/compiler/mesh: follow the type of offset variable
intel/compiler: remove unused field from fs_thread_payload
anv: halve the push constants space in mesh pipelines
crocus/meson: add back dependency on libintel_dev
anv,hasvk: remove stale comments
anv: call nir_shader_gather_info early
anv: work around for per-prim attributes corruption
intel/compiler: compactify locations of mesh outputs
anv: ignore structure types handled in vk_device_memory_create
Marek Olšák (212):
glthread: fix an upload buffer leak
util: fix util_is_vbo_upload_ratio_too_large
mesa: allow GL_UNSIGNED_INT64_ARB as vertex format for ARB_bindless_texture
glapi: autogenerate function parameters with no space between * and variable
glthread: handle GL_*_ARRAY in glEnable/Disable
glthread: set GL_OUT_OF_MEMORY if we fail to upload indices
glthread: set GL_OUT_OF_MEMORY if we fail to upload vertices
glthread: execute glMultiDrawArrays(draw_count < 0) asynchronously
glthread: change multi_draw_elements_async() to never fail due to large size
glthread: do vertex uploads if an index buffer is present for glDrawElements
mesa: move gl_vertex_format_user definition into glthread.h
glthread: pack and name the type of glthread_vao::Attrib
glthread: make marshal functions for glBegin/End attribs non-static
glthread: remove the vbo_upload_ratio_too_large fallback for glMultiDrawElements
glthread: do vertex uploads if an index buffer is present for MultiDrawElements
glthread: disallow glthread if buffer uploads are unsupported
ac/llvm: run the LLVM sinking pass because LLVM will stop running it
ac/llvm: run the IPSCCP pass
ac/llvm: remove llvm:: now that we use “using namespace llvm”
amd: update amdgpu_drm.h
ac/gpu_info: add PCIe info
radeonsi/ci: update gfx10.3 results
radeonsi/ci: add gfx1100 results
radeonsi: fix RB+ blending with sRGB formats
radeonsi/gfx11: unset SAMPLE_MASK_TRACKER_WATERMARK to fix hangs
amd: split GFX1103 into GFX1103_R1 and GFX1103_R2
amd: fix tile_swizzle on gfx11 - should be shifted by 10 bits, not 8
amd: update SX_BLEND_OPT_EPSILON.MRT0_EPSILON enum definitions
amd: update shadowed register tables for gfx11
amd: improve RB+ blending precision
radeonsi: implement RB+ depth-only rendering for better perf
radeonsi/gfx11: remove the INST_PREF_SIZE workaround
radeonsi/gfx11: add a comment why we use PRIM_GRP_SIZE <= 252
radeonsi/gfx11: adjust ACCUM_* fields for tessellation
radeonsi/gfx11: fix blend->cb_target_mask dependency for shader keys
radeonsi/gfx11: move the PIXEL_PIPE_STAT_CONTROL event into the GFX preambles
radeonsi/gfx11: use new packet EVENT_WRITE_ZPASS
radeonsi: deduplicate VS/TES/GS update code
radeonsi/gfx11: always set MSAA_NUM_SAMPLES=0 for DCC_DECOMPRESS
radeonsi: merge si_ps_key_update_framebuffer_blend & .._update_blend_rasterizer
radeonsi: determine alpha_to_coverage robustly in si_update_framebuffer_blend_rasterizer
radeonsi: never set INTERPOLATE_COMP_Z
amd: unify and tune the attribute ring size for gfx11
amd: change pbb_max_alloc_count for gfx11
amd: update the cache size for gfx1103_r1
amd: update late_alloc_wave64 for gfx11
amd: sort and re-indent packet definitions
amd: fix typo in shadowed uconfig registers on gfx11
amd: document OOB behavior on gfx11
amd/registers: remove confusing definitions from gfx10-rsrc.json
radeonsi: set NEVER as the depth compare func if depth compare is disabled
amd/llvm: fix LLVM 15 & 16 crashes in SelectionDAG.cpp
radeonsi: call ac_init_llvm_once before any util_queue initialization
radeonsi: set sampler COMPAT_MODE in the corresponding branch
amd/ci: update sanctuary trace sha1
radeonsi/gfx11: don’t add mrt0 export for alpha-to-coverage if mrtz is present
radeonsi/gfx11: don’t add alpha to mrt0 format for A2C if exporting via mrtz
amd: define new SET_*_REG_PAIRS packets
radeonsi: clean up si_set_mutable_tex_desc_fields
amd/surface: clean up is_dcc_supported_by_L2
amd,util: fix how lod bias is converted to fixed-point
amd: don’t hardcode real VGPR allocation granularity on gfx10.3 and gfx11
glthread: track the current element array buffer in the Core profile too
mesa: ignore indices[i] if count[i] == 0 for MultiDrawElements
glthread: initialize indices[i] for no-op MultiDrawElements
glthread: upload non-BO indices in the core profile to fix GStreamer
glthread: add a heuristic to stop locking global mutexes with multiple contexts
glthread: ignore non-VBO vertex arrays with NULL data pointers
Revert “ci/zink: Disable Amnesia trace until the linked issue gets fixed.”
glthread: rewrite glMultiDrawArrays to never fail to upload vertices
glthread: change glMultiDrawElements to execute draw_count < 0 asynchronously
glthread: don’t execute glDraw code if we’re inside glBegin/End
glthread: don’t pass index bounds to the driver for async calls
glthread: move some draw call parameters closer to their use
glthread: don’t bind/unbind uploaded indexbuf, pass it to glDraw directly
glthread: don’t bind/unbind uploaded indexbuf, pass it to glMultiDraw directly
glthread: track vertex formats for all attributes
glthread: add a vertex upload path that unrolls indices for glDrawElements
glthread: reorder draw code a little
glthread: add ctx->GLThread.draw_always_async to simplify draw checking
glthread: remove goto statements and add unlikely() into draw functions
glthread: inline draw functions that have only one use
glthread: don’t execute Draw and BufferSubData calls if the context is lost
glthread: handle non-VBO uploads for glMultiModeDraw{Arrays,Elements}IBM
glthread: add API to allow passing DrawID from glthread to mesa
glthread: convert (Multi)DrawIndirect into direct if user buffers are present
glthread: remove unnecessary debug code
glthread: don’t free glthread for GL_DEBUG_OUTPUT_SYNCHRONOUS, only disable it
glthread: don’t restore non-VBO vertex arrays after all draws
Revert “radeonsi/ci: Update stoney test expectations”
radeonsi: fix COMPAT_MODE on gfx8-9
amd: fix LOD_BIAS on gfx6-9 and adjust the lod bias CAP
amd: add missing gfx11 register definitions
amd: bump AMD_MAX_SE and change the CU mask type to 16 bits
radeonsi/gfx11: fix the CU_EN clear mask for RSRC4_GS
radeonsi/gfx11: don’t set non-existent VGT_STRMOUT_BUFFER_CONFIG
radeonsi/gfx11: set CB_COLORi_INFO.MAX_COMP_FRAG on GFX1103_R2
radeonsi: move a few DB_SHADER_CONTROL states into si_shader_ps
radeonsi: change si_shader::ctx_reg to a nameless union for better readability
radeonsi: remove no-op setting of THDS_PER_SUBGRP
radeonsi: use SPI_SHADER_USER_DATA_HS_0 definition instead of LS_0
radeonsi: set PA_SU_VTX_CNTL consecutively with PA_CL_GB_VERT_CLIP_ADJ
radeonsi/gfx11: ignore alpha_is_on_msb because the hw ignores it
radeonsi: replace si_screen::has_out_of_order_rast with the radeon_info field
radeonsi: disable Smart Access Memory because CPU access has large overhead
amd,radeonsi: remove unused LLVM functions
amd/registers: unify VRS combiner definition names between gfx103 and gfx11
amd: replace SI_BIG_ENDIAN with UTIL_ARCH_BIG_ENDIAN
radeonsi: remove returns from si_emit_global_shader_pointers
radeonsi: reformat emit_cb_render_state, create_blend_state, create_rs_state
radeonsi: remove a gfx11 check in si_shader_gs (legacy GS)
radeonsi: remove unused VS_STATE_LS_OUT_PATCH_SIZE
radeonsi: always add 1 to lshs_vertex_stride now that LS_OUT_PATCH_SIZE is gone
radeonsi: correct and clean up obsolete vs_state_bits comments
radeonsi: rename esgs_itemsize -> esgs_vertex_stride
amd: query the per-SIMD VGPR counts from the kernel, don’t hardcode them
radeonsi: don’t clamp z_samples to fix Unreal Tournament 99
amd/registers: only define SPI and COMPUTE registers in the 0xB000 range
radeonsi: reorganize emit_db_render_state and simplify VRS code
radeonsi: reorganize si_initialize_color_surface for better readability
radeonsi: reorganize si_init_depth_surface for better readability
radeonsi: don’t set PACKET_TO_ONE_PA for line stippling
radeonsi/gfx11: change the default of COMPUTE_DISPATCH_INTERLEAVE to 256
amd: implement conformant TRUNC_COORD behavior for gfx11
amd/gpu_info: add a workaround for SI_FORCE_FAMILY=gfx1100
nir,amd: add and use nir_intrinsic_load_esgs_vertex_stride_amd
nir: lower to fragment_mask_fetch/load_amd with EQAA correctly
glthread: fix a perf regression due to draw_always_async flag, fix DrawIndirect
mesa: fix glPopClientAttrib with fixed-func VP and zero-stride varyings
mesa: remove a redundant call to _mesa_update_edgeflag_state_vao
mesa: initialize VertexProgram._VaryingInputs before the first use
amd: update amdgpu_drm.h
amd,radeonsi: change enabled_rb_mask to 64 bits
amd: query cache sizes from the kernel
ac/nir: don’t use load_esgs_vertex_stride_amd on gfx6-8
amd: massively simplify how info->spi_cu_en is applied
amd/rtld: allow 64K LDS for all shader stages except for gfx6
radeonsi/ci: update flakes and gfx8-polaris11 results
radeonsi: remove Smart Access Memory because CPU access has large overhead
radeonsi: reorganize si_emit_framebuffer_state for better readability
radeonsi: don’t merge SET_* packets that have a different index in si_pm4_state
radeonsi: reindent code in si_state_binning.c
radeonsi: add si_pm4_set_reg_va to simplify setting reg_va_low_idx for RGP
radeonsi: check the pm4.reg_va_low_idx assertion unconditionally
radeonsi: simplify encoding VGPRS and SGPRS
radeonsi: assume shader is never NULL in si_emit_shader_*
nir: return progress from nir_lower_io_to_scalar
nir: skip nir_op_unpack_32_4x8 in nir_lower_alu_width
ac/nir: add ac_nir_lower_subdword_loads to lower 8/16-bit loads to 32 bits
aco: implement nir_op_unpack_32_4x8
ac/llvm: implement nir_op_unpack_32_4x8
amd: lower subdword UBO loads in NIR
amd: lower multi-component subdword SSBO loads in NIR
lavapipe/ci: add a new flake
amd: add nir_intrinsic_xfb_counter_sub_amd and fix overflowed streamout offsets
amd/llvm,radeonsi/gfx11: switch to using GDS_STRMOUT registers
radeonsi/gfx11: only allocate GDS OA for streamout, GDS memory is not needed
radeonsi: emulate VGT_ESGS_RING_ITEMSIZE in the shader on gfx9-11
radeonsi: merge si_emit_initial_compute_regs with si_init_cs_preamble_state
radeonsi: separate nir_texop_descriptor_amd lowering
radeonsi: lower nir_texop_sampler_descriptor_amd
radeonsi: set pm4.atom.emit in si_get_shader_pm4_state
radeonsi: reindent si_shader_ls, si_shader_es, si_shader_gs, si_shader_vs
radeonsi: reorganize si_shader_hs
radeonsi: reorganize si_shader_ngg
radeonsi: reorganize si_shader_ps
radeonsi: other cosmetic changes in si_state_shaders.cpp
radeonsi: allow using 64K LDS for NGG to allow larger workgroups
radeonsi: increase NGG workgroup size to 256 for VS/TES with streamout and GS
glapi: move files specific to shared-glapi into the shared-glapi subdirectory
glapi: inline the meson list files_mapi_util
mesa: move ctx->Table -> ctx->Dispatch.Table except Client & MarshalExec
mesa: rename CurrentClientDispatch to GLApi
mesa: put dispatch table initialization into one place
glthread: qualify the *cmd unmarshal parameter with restrict
vbo: fix current attribs not updating gallium vertex elements
radeonsi: remove unused TCS/TES SGPR fields
radeonsi: dump shader stats only if dumping asm shaders
radeonsi: replace nonir,noir,noasm,preoptir options with new reworked options
radeonsi: remove duplicated gfx11 check in si_msaa_resolve_blit_via_CB
radeonsi: rework MSAA resolve averaging to exploit instruction-level parallelism
radeonsi: add AMD_DEBUG=nowcstream to enable caching for stream_uploader
radeonsi: don’t print the base non-view texture format for AMD_TEST=computeblit
radeonsi: fix AMD_TEST=computeblit being rejected on gfx < 11
radeonsi: don’t convert to fp16 in the compute blit if not testing
radeonsi: don’t use fp16_rtz for FP formats in the compute blit
radeonsi: correct an assertion if we get a display list with no vertex buffers
ac/nir: don’t emit duplicated parameter exports
ac/nir: use plural correctly in the ac_nir_export_parameters name
radeonsi: remove unused vs_output_param_mask
egl: reorder code in _eglQueryDevicesEXT, add *swrast variable
egl: don’t expose swrast device if swrast is not built
amd/llvm: fix handling of unsupported vec3 loads on gfx6
amd/llvm: remove no-op code for vec3 loads in ac_build_tbuffer_load
amd: update addrlib
amd: rename GFX1036 -> RAPHAEL_MENDOCINO
amd: set the correct LLVM processor name for gfx1036
radeonsi/gfx11: reduce MSAA samples to 8 for no-attachment framebuffer
radeonsi: simplify binning settings to work around GPU hangs
amd: add gfx940 register definitions
amd: add initial code for gfx940
radeonsi: use COMPUTE_DISPATCH_SCRATCH_BASE on gfx940
radeonsi: always use ffma32 on gfx940
ac/surface: force linear image layout for chips not supporting image opcodes
radeonsi: add an emulated image descriptor for gfx940
ac/nir: implement image opcode emulation for CDNA, enable it in radeonsi
radeonsi: don’t set registers that don’t exist on gfx940
amd/registers: simplify integer division by 0x1000 in the parser
amd/registers: fix the parser to include CP_COHER registers for gfx940
amd/registers: update gfx940.json
amd/registers: use gfx9 packet definitions for gfx940
nir: fix 2 bugs in nir_create_passthrough_tcs
Mario Kleiner (1):
v3dv: Enable (leased) direct display extensions.
Mark Collins (4):
meson: update flex/bison requirement to cover all usages
meson: forcefully disable libdrm when host doesn’t have it
tu: KGSL backend rewrite
tu: fix tu_GetInstanceProcAddr not handling null instance
Mark Janes (13):
intel: Implement Wa_16011448509
util: add macro to support gcc/clang poison
intel/dev: generate helpers to identify platform workarounds
intel/dev: Print required workarounds with intel_dev_info
intel/fs: use generated workaround helpers for Wa_14010017096
intel/fs: use generated helpers for Wa_1209978020 / Wa_18012201914
intel/fs: use generated workaround helpers for Wa_14017989577
intel: use generated workaround helpers for Wa_1409600907
intel: use generated helpers for Wa_1409433168/Wa_16011107343
intel/fs: use generated helpers for Wa_14013363432 / Wa_14012688258
intel/dev: fix macro string concatenation for INTEL_WA_{id}_GFX_VER
intel/dev: fix macro naming convention in gen_wa_helpers.py
intel/dev: use GFX_VERx10 to detect genX compilation
Martin Roukala (né Peres) (22):
ci/deqp-runner: compress results.csv before uploading it to GitLab
ci/piglit: compress results.csv before uploading it to GitLab
zink/ci/radv: remove a test from the fails list
zink/ci: add a fail to the VG flake list
zink/ci: relocate radv testing from radv’s gitlab-ci.yml
zink/ci: add spec@!opengl 1.1@line-smooth-stipple to the fails list
ci/b2c: uprev to b2c v0.9.9
ci/debian/x86_test-vk: drop an outdated dependency
ci/core-manual-rules: enclose the whole condition in quotes
zink/ci: allow running manual jobs again on RADV
ci/init-stage2: allow sourcing the job env vars from the CWD
ci/init-stage2: always set XDG_RUNTIME_DIR
ci/b2c: move away from the hand-rolled initscript
ci: bring back the valve farm online
ci/valve-farm-rules: allow running jobs from outside the mesa namespace
radv/ci: reduce the parallelism for vkcts-vangogh
zink/ci: increase the parallelism of zink-radv-vangogh-valve
zink/ci: update the radv expectations
radv/ci: update VanGogh’s expectations
ci/b2c: increase the console timeout to 4 minutes
radv/ci: update the navi10 expectations
zink/ci: add a test to the fails list
Matt Coster (6):
pvr: Extract setup of winsys job submit flags into separate functions
pvr: Add support for geometry-only render jobs
pvr: Add pvr_csb_bake()
pvr: Rename global_queue_job_count to global_cmd_buffer_submit_count
pvr: Split render job submission for multi-layer framebuffers
pvr: Add firmware stream support for transfer submit
Matthieu Bouron (1):
lavapipe: honor dst base array layer when resolving color attachments
Mauro Rossi (1):
hasvk: include “vk_android.h” header in anv_android.c
Maíra Canal (1):
v3dv: remove unused clamp_to_transparent_black_border property
Michel Dänzer (29):
mesa/st: Fix GL_EXT_texture_type_2_10_10_10_REV name in comment
mesa/st: Handle all 10 bpc types in st_choose_format
glsl/standalone: Fix up _mesa_reference_shader_program_data signature
glsl/standalone: Do not pass memory allocated with ralloc_size to free
anv/grl: Use union for reinterpreting integer as float
clover: Reserve vector memory in make_text_section
ci: Update Fedora image to 36
ci: Re-enable intel-clc in fedora-release job
ci: Enable i915 Gallium driver in fedora-release job
ci: Enable the hasvk Vulkan driver in the fedora-release job
frontend/dri: Initialize callbacks in dri_swrast_kms_init_screen
nouveau: Make getSize return unsigned int
r600: Use container_of instead of direct pointer cast
crocus: Use ralloc_free for memory allocated with rzalloc
iris: Use ralloc_free for memory allocated with rzalloc
ci: Remove some -Werror workarounds for debian-android job
ci: Split up -Werror workarounds for debian-mingw32-x86_64 job
intel/vk/grl: Do not use no_override_init_args for C++
ci: Pass -Werror to compiler linking stage for LTO
ci: Allow passing c{,pp}_link_args to meson
ci: Make ccache optional
ci: Drop ccache from Fedora image
ci: Install procps-ng in Fedora image
ci: Enable LTO for fedora-release job
vulkan: Fix GetPhysicalDeviceSparseImageFormatProperties definitions
svga: Make vmw_svga_winsys_buffer_map definition match declaration
svga: Make declaration of emit_input_declaration match definition
clover/llvm: Use llvm::DataLayout::getABITypeAlign with LLVM >= 16
clover/llvm: Use std::nullopt already with LLVM 16
Michel Zou (4):
ci/mingw: drop useless -Wno-error flags
vulkan/wsi: fix -Wnarrowing warning
vk/entry_points: fix mingw build
mesa/draw: fix -Wformat warning
Mike Blumenkrantz (536):
zink: simplify get_slot_components() for xfb emission
zink: add renderdoc handling
zink: prune old swapchains on present
zink: break out implicit feedback loop detection into separate function
zink: set textures_used in analyze_io
zink: outdent code in add_implicit_color_feedback_loop()
zink: make implicit feedback loop application stricter
zink: skip implicit feedback loop layout changes if feedback loop not present
zink: store drm format as internal_format for imported resources
zink: handle modifier nplanes queries correctly for planar formats
zink: NV_compute_shader_derivatives
zink: preserve present resources during async presentation
zink: add a util function for creating semaphores
zink: add a binary semaphore cache
zink: move semaphore caching to zink_reset_batch_state()
zink: consolidate semaphore creation where possible
zink: simplify some dynarray concat descriptor code
zink: delete need_blend_constants
zink: don’t use ds3 blend states without color attachments
radv: repack radv_graphics_pipeline struct
radv: reorder dynamic state checks during bind
radv: simplify depth aspect check in radv_handle_image_transition()
radv: add some graphics pipeline hints to optimize pipeline bind
radv: remove redundant type sizing
radv: add an early out in radv_cmd_buffer_flush_dynamic_state()
zink: use actual swapchain object for surface comparison
radv: stop using radv_pipeline_has_stage() in BindPipeline
zink: flag old-style shadow tex mask for fragment shaders
zink: break out tex dest rewriting into separate function
zink: add an extra_data param to zink_shader_compile
zink: track depth swizzle on samplerviews
zink: add a fs shader key member to indicate depth texturing mode
zink: rework depth sampler splatting in shaders
zink: block pipeline fast-pathing for any programs using depth texture modes
zink: plug in the program/module parts of shadow texture mode emulation
zink: create another samplerview for shadow textures
zink: remove old depth swizzle workaround
zink: pass depth swizzle data block to shader compile
mesa: remove dead parameter doc for _mesa_new_texture_object()
mesa: populate gl_program::ShadowSamplers mask from shader data
mesa: (more) correctly handle incomplete depth textures
zink: fix implicit feedback loop detection
radv: Move constant flushing check out to callers.
zink: fix VK_DYNAMIC_STATE_LINE_WIDTH usage
zink: move barrier jit to zink_context.c
zink: don’t skip repeated handling feedback loops
zink: return false for implicit feedback loop check with image binds
zink: update sampler layout when detecting feedback loop for first time
zink: force GENERAL layout for all fb attachments with image binds
zink: validation ci updates
zink: reorder commands more aggressively
Revert “zink: allow direct memory mapping for any COHERENT+CACHED buffer”
zink: fix heap/memory type selection
zink: add VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT for buffers if ext is enabled
zink: set vkusage/vkflags for buffer resource objects
zink: assert that buffer descriptor usage is populated before bind
zink: always set RESOURCE usage for descriptor buffers
zink: disable bindless texture ext with descriptor buffer
zink: set VK_PIPELINE_CREATE_DESCRIPTOR_BUFFER_BIT_EXT when using DB
zink: don’t add dmabuf export type if dmabuf isn’t supported
zink: stop leaking push descriptor templates
zink: always unref old images when adding new binds
zink: hook up VK_EXT_multisampled_render_to_single_sampled
zink: shrink zink_render_pass_state::msaa_expand_mask
zink: use VK_EXT_multisampled_render_to_single_sampled for EXT_multisample_render_to_texture
lavapipe: move noop fs creation to device
lavapipe: add refcounting for shader nir
lavapipe: refcount nir shaders instead of cloning
lavapipe: break out (and slightly refactor) gallium shader cso creation
lavapipe: create gfx gallium csos at pipeline bind
lavapipe: delete unused pipelines immediately
lavapipe: delete lvp_pipeline::mem_ctx
lavapipe: try harder to reuse pipeline layouts during merge
zink: only set VkPipelineColorBlendStateCreateInfo::attachmentCount without full ds3
zink: fix zink_mem_type_idx_from_bits()
zink: rework descriptor buffer templating to use offsets
Revert “zink: fix zink_mem_type_idx_from_bits()”
zink: enable PIPE_CAP_ALLOW_GLTHREAD_BUFFER_SUBDATA_OPT
zink: make bindless buffer_infos a union
zink: fix bindless struct member comments
zink: skip updating descriptor buffer sets that aren’t active
zink: set VK_PIPELINE_CREATE_DESCRIPTOR_BUFFER_BIT_EXT on compute pipelines
zink: break out descriptor binding into separate function
zink: add a flag to indicate whether a descriptor buffer is bound
zink: implement descriptor buffer handling of bindless texture
zink: enable bindless texture with ZINK_DESCRIPTORS=db
zink: free descriptor buffer maps on batch state destroy
zink: fix more cases of heap/memtype suballocator mismatch
zink: cache and reuse dummy inputattachment for fbfetch
zink: handle missing line rasterization modes with ds3
zink: add back VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT for bindless
gallium: add PIPE_CAP_NULL_TEXTURES
radeonsi: set PIPE_CAP_NULL_TEXTURES
zink: conditionally enable PIPE_CAP_NULL_TEXTURES
zink: fix max acquired image count
lavapipe: disable VK_FORMAT_FEATURE_2_COLOR_ATTACHMENT_BLEND_BIT for int formats
zink: set PIPE_CAP_VALIDATE_ALL_DIRTY_STATES
zink: move bindless_layout to screen and init on creation
zink: take screen param in init_db_template_entry()
zink: const-ify a bunch of shader key inlines
zink: move gpl usability checks to static inline for reuse
zink: remove duplicated gpl output blend initializations
zink: store last pipeline directly for zink_gfx_program::last_pipeline
zink: pass screen to descriptor_util_pool_key_get()
zink: delete zink_screen::framebuffer_cache
zink: make last_vertex_stage the first bit in zink_vs_key_base
zink: ralloc zink_shader structs
zink: add a define for the “default” optimal key
zink: add a define for testing that an optimal key is the default
zink: add VK_PIPELINE_CREATE_DESCRIPTOR_BUFFER_BIT_EXT for gpl libs
zink: don’t set blend_id with full_ds3
zink: set gfx feedback loop bit in pipeline state for driver workaround
zink: set zs feedback loop bit from driver workaround on ctx create
zink: fix gpl lib hashing
zink: use screen indexing for bindless descriptor set in db bind
zink: use screen indexing for bindless descriptor set in template bind
util/vbuf: fix multidraw unrolling
zink: flag bindless_init before calling zink_batch_bind_db() in init
zink: avoid the descriptor set multiplier for bindless buffers
zink: split out VkShaderModule creation
zink: add flags param to zink_pipeline_layout_create()
zink: split out gfx pipeline library creation
zink: add gpl flags for libraries based on shaders passed
zink: allow multiple gpl libraries in zink_create_gfx_pipeline_combined()
zink: move gpl input/output funcs to zink_pipeline.c
zink: enable combining intermediate gpl libs from combine function
zink: use GPL to handle (simple) separate shader objects
zink: set PIPE_CAP_SURFACE_REINTERPRET_BLOCKS
zink: store gfx_hash on zink_gfx_program
zink: break out zink_gfx_program::libs into refcounted object
zink: implement cross-program pipeline library sharing
zink: add newlines to some debug printfs
zink: rename some variables in zink_set_shader_images()
zink: unset gfx shader read when unbinding shader images
zink: remove stale comment
zink: unref image buffer descriptors on unbind
zink: rework set_shader_images() hook
zink: pull out image descriptor updating in set_shader_images
zink: add a local is_compute var for set_shader_images
zink: allocate all batch command buffers in one call
zink: sync LTO compiles for GPL pipelines on shader free
zink: fix descriptor pool free iterating
zink: don’t fetch/update pipeline cache for separate shader programs
zink: assert that the found program matches the expected one in shader_free
zink: flag gfx programs as removed-from-cache by default
zink: rework separate shader descriptor iterating
zink: use a single descriptor buffer for all non-bindless types
zink: add an io assignment pass for separate shaders
zink: rename a struct member for clarity
zink: move zink_batch_state::db_bound reset to zink_batch_descriptor_reset()
zink: move db_bound to batch descriptor data
zink: ensure db is bound before separate shader update
zink: store base descriptor size on the screen
zink: implement a scaling descriptor buffer size
zink: calloc separable program zink_gfx_library_key struct
zink: also replace hash_entry::key when replacing separable program
zink: always use NEAREST for zs blits
zink: fix indentation of rebind_image()
zink: only try for a fb rebind if fb binds exist in rebind_image()
zink: account for null surface when trying to retain clears on fb bind
zink: break out pipe_surface init for new surface creation
zink: const-ify a surface param
zink: don’t handle mutable init on surface creation with tc enabled
zink: verify compressed format layer count when creating surfaces
zink: set ZINK_DESCRIPTORS=db for radv jobs
zink: enable renderpass optimizations by default for selected drivers
Revert “zink: always use NEAREST for zs blits”
zink: block LINEAR filtered blits for zs formats
lavapipe: enable linear filtering for depth formats
gallium: plumb resolve attachments through from frontends -> pipe_framebuffer_state
dri3: avoid deadlocking when polling deleted windows for events
vulkan/wsi: avoid deadlocking dri3 when polling deleted windows for events
vl/dri3: avoid deadlocking when polling deleted windows for events
zink: delete some now-broken ntv dref sampling code
zink: more accurately handle i/o for separate shaders
zink: handle semi-matching i/o for separate shaders
zink: enable renderpass optimizing on lavapipe
kopper: fix loop iterating for msaa texture creation
zink: move db input attachment size check to screen init
zink: make ZINK_DESCRIPTOR_MODE=db the default
driconf: add zink glthread disable for a game
aux/tc: add a ‘has_resolve’ member to tc_renderpass_info
zink: actually hook up ZINK_DEBUG=norp
zink: add ZINK_DEBUG=map
zink: add debug marker tracing for qbo updates
util/box: add intersection test functions for 1d/3d
zink: add some tracking for copy box regions
zink: add a util function for optimizing TRANSFER_DST image barriers
zink: utilize copy box tracking to avoid barrier emission for buf2img copies
zink: fix slab allocator sizing
zink: delete dead uniform variables
zink: fix shader read access removal for barrier generation
zink: rework descriptor unbind params to use is_compute directly
zink: fix bindless texture barrier generation
zink: delete unused emit_image param in ntv
zink: simplify/rework image typing in ntv
zink: avoid adding ubo/ssbo bindings multiple times for different bitsizes
zink: add locking for zink_screen::copy_context and defer creation
zink: allow direct memory mapping for any COHERENT+CACHED buffer
lavapipe: EXT_image_sliced_view_of_3d
vulkan/wsi: switch to using an options struct for last param
vulkan/wsi/x11: make 4 image minimum for xwayland driver-specific
docs: add pipeline library support for tu
aux/tc: track whether queries have been terminated in a renderpass
aux/tc: only call tc_parse_draw() when parsing renderpass info
zink: move zink_batch_no_rp call for query reset
zink: remove suspended queries from list before resuming
zink: reset queries on the promoted cmdbuf when possible
zink: rewrite zink_query_start struct
zink: fix zink_query_start initialization
zink: fix possible query destroy leak
zink: make zink_vk_query unref consistent
zink: delete zink_query::last_start_idx
zink: handle multiple query starts in qbo update
zink: don’t auto-sync qbos on query end
zink: add zink_query::suspended to indicate suspended state of queries
zink: un-suspend queries if they end while suspended
zink: don’t double suspend queries
zink: refuse to start cs invocation queries in renderpass
zink: only try doing qbo updates on query suspend if !in_rp
zink: handle null query results
zink: handle null query results for conditional render
zink: only update qbo for TIME_ELAPSED on start if !in_rp
zink: try updating qbos on query resume if !in_rp
zink: reorder some query code
zink: rework find_or_allocate_qp()
zink: create/use query pools dynamically
zink: rework query pool overflow
zink: track whether queries were started in a renderpass
zink: break out query suspend functionality for reuse
zink: use more consistent check for deleting zink_query::stats_list links
zink: pull ‘was_line_loop’ into ctx for query updating
zink: always start/stop/resume queries inside renderpasses
zink: use tc renderpass optimizing to more optimally start queries
zink: skip buffer barriers for ACCESS_NONE -> ACCESS_READ / ACCESS_WRITE
zink: disable queries for clear_texture()
zink: resume queries after conditional render and clears are processed
zink: only resume queries inside renderpasses from set_active_query_state
zink: track whether a primgen query is suspended and needing color write hacks
zink: rework xfb queries for drivers with poor primgen support
zink: merge qbo update copies when possible
zink: set predicate_dirty on query creation
zink: eliminate internal qbo copy barrier
util/box: fix off-by-one calc error in intersection funcs
zink: avoid ballooning of copy box tracking
zink: add perfetto tracing for barriers
zink: avoid unnecessary read-only layout changes for zs attachments
zink: skip attachment barrier for redundant layout-setting if !valid
zink: add functions for faster batch-usage completion checks
zink: fix zink_resource_access_is_write()
zink: use split image barriers if the image can be easily proved idle
zink: skip buffer barriers if the buffer can be easily proved idle
zink: determine whether debug markers are used on screen create
zink: be more descriptive with perfetto buffer barriers
zink: reorder some blit debug markers
zink: pass cmdbuf to debug marker begin
zink: remove debug markers for u_blitter ops
zink: when skipping a TRANSFER_DST image barrier, set access tracking
zink: add frame trace markers on the queue
zink: unbind fb on context destroy
zink: only add deferred barrier on fb unbind when layout needs to change
zink: fix descriptor update flagging on null ssbo set
zink: propagate valid_buffer_range when replacing buffer storage
zink: check for layout updates when unbinding samplerviews
zink: eliminate pre barrier for adding resource binds
zink: don’t unset existing access when adding resource binds
zink: explicitly flush src clears when u_blittering
zink: always set color writes on the unordered cmdbuf
zink: bind descriptor buffers to unordered cmdbuf
zink: set dynamic pcp for unordered cmdbuf
zink: always set batch usage for descriptors after barrier
llvmpipe: fix LP_PERF=no_depth to ignore depth format
zink: track the last write access for resources
zink: add a mechanism for managing TRANSFER_DST buffer barriers
zink: add a mechanism to trigger copy box resets from batch state reset
zink: add a driver workaround to disable copy box optimizations
zink: hook up buffer TRANSFER_DST barrier optimizing
aux/tc: fix rp info resizing clobbering current info
vulkan/wsi: fix crash in failed swapchain creation for wayland
lavapipe: split out spirv compile of shaders
lavapipe: split out shader struct members into their own struct
lavapipe: pass shader struct and layout to scan_pipeline_info()
lavapipe: more small shader struct usage tweaks
lavapipe: move xfb init to shader struct
lavapipe: rename inline uniform function params
lavapipe: move uniform inline functions to shader struct
lavapipe: break out main shader lowering into separate function
Revert “Revert “ci: disable mesa-swrast runner jobs””
zink: ignore renderdoc if ZINK_RENDERDOC isn’t in use
radv: delete radv_graphics_pipeline_compile() asserts
radv: avoid a huge memset in radv_graphics_pipeline_compile()
aux/tc: use renderpass tracking to optimize texture_subdata calls
lavapipe: fix dynamic depth clamping
lavapipe: set render_condition_enabled=false for vkCmdClearDepthStencilImage
lavapipe: add command debugging
lavapipe: beef up LVP_POISON_MEMORY
ci: fix LVP_POISON_MEMORY usage
zink: rework zink_resource::valid_buffer_range
zink: return the unordered state from zink_resource_buffer_transfer_dst_barrier()
zink: unify image TRANSFER_DST barrier checks
zink: rename zink_check_transfer_dst_barrier()
zink: super reorder buffer copies
tu: don’t set startup debug on debug builds
zink: fix copy box iteration
glthread: align small buffer uploads to 4 bytes
zink: fix copy box iteration when adding
zink: fix copy box merging adjacency
aux/trace: delete GALLIUM_TRACE_NIR log message
zink: fix layer check for compressed format surface creation
zink: track current queue for resources
zink: remove redundant dmabuf_acquire setting
zink: use res->queue to auto-handle queue transitions back to gfx queue
zink: track tc fences better
zink: add an assert to catch renderpass optimizing bugs
zink: don’t use/update tc rp info while blitting
zink: reset tc fb info upon splitting a renderpass
zink: add and use a function for “safely” ending renderpasses
zink: disable queries when flushing clears from set_fb
zink: disable tc flush notify with rp optimizing
zink: trigger oom flushes more aggressively from copy ops
aux/tc: flag late zs clears as partial clears
aux/tc: use a local ‘deferred’ variable in tc_flush()
aux/tc: fix renderpass splitting on flush
aux/tc: track the number of active queries
aux/tc: don’t sync for get_sample_position
aux/tc: add a function to reset rp info
aux/tc: fix initial rp info allocation
aux/tc: make some of the rp tracking api private
aux/tc: rework inter-batch renderpass info handling
zink: only flag rp info for updating on flush, don’t actually update
zink: add tracing for copy ops
zink: expand ZINK_DEBUG=sync to cover copy ops
zink: add some asserts for zs layout in dynamic render
zink: double check layouts for possible feedback loop images
zink: end rp earlier in set_framebuffer_state
zink: add a function for applying u_blitter barriers
zink: add a dynamic render version of clear_texture hook
zink: reset fb clears using the clears_enabled mask
zink: manually apply barriers whenever zink_context::blitting is set
zink: split out pipeline rp info update function
zink: add zink_context::unordered_blitting to preserve unordered flags
zink: implement unordered u_blitter calls
zink: round geometry for u_blitter debug markers
zink: catch zs u_blitter ops for draw markers
zink: add debug markers for draws
zink: track zsbuf info even when rp optimizing is disabled
zink: fix dsa state parsing for tc info
zink: track whether the fb zsbuf is readonly
zink: add a fixup case for readonly zsbuf clears
zink: improve no-oping of write -> readonly zsbuf layouts
zink: don’t flag rp layout check on next draw when rp optimizing
zink: allow zink_is_zsbuf_used() without tc / rp optimizing
zink: rename add_implicit_color_feedback_loop()
zink: explicitly eliminate feedback loops for unused zsbufs
zink: further eliminate zs implicit feedback loops for read-only access
zink: split out luminance/alpha clear conversion code for reuse
zink: convert luminance/alpha clear colors in dynamic texture clear
zink: handle swapchain creation failure less lazily
zink: simplify resource_check_defer_buffer_barrier()
zink: delete unused barrier api
zink: never split a renderpass for a loadop change
zink: flag some rp ends as unsafe
zink: add batch refs for framebuffer surfaces on bind and ref update
zink: fix unordered access for image descriptors
zink: force unordered_write=false when binding image descriptors
zink: add an assert to ensure zsbuf invalidation doesn’t break rendering
zink: only run post-fb-unbind layout stuff if the resource isn’t being destroyed
zink: always set sampler layouts when unbinding fb images while rp optimizing
zink: rework handling of unordered->ordered write buffer barriers
zink: don’t update fbfetch in db mode if inputAttachmentDescriptorSize==0
zink: add ZINK_DEBUG=flushsync
zink: track whether zsbuf is unused
zink: flag rp layout change if zsbuf usedness changes on dsa/fs state bind
driconf: make glthread=true default for source games
aux/trace: dump blend states with enums
aux/trace: fix GALLIUM_TRACE_NIR handling
zink: fix some type mismatches for c++ compilation
zink: break out a src region barrier check for reuse
zink: move all barrier-related functions to c++
zink: use c++ template to deduplicate all the buffer barrier code
zink: minor tweaks for image barriers
zink: use c++ template to deduplicate image barrier functions
zink: stop leaking separate shader nir
lavapipe: always copy streamout info when creating shaders
lavapipe: don’t memcpy tess_ccw when copying pipeline library shaders
lavapipe: refactor shader compile functions to not take pipeline params
lavapipe: track bound shader stages on rendering_state
lavapipe: add a device member to rendering_state
lavapipe: stop using rendering_state::pipeline
lavapipe: refactor compute shader binding
lavapipe: merge some loops in handle_graphics_pipeline()
lavapipe: PIPE_SHADER_ -> MESA_SHADER_
lavapipe: don’t access pipeline shader structs as much during bind
lavapipe: pull out dynamic tess origin check in gfx pipeline bind
lavapipe: break out all the important parts of gfx pipeline setting for reuse
lavapipe: delete unused struct member
lavapipe: refactor pipeline destroy a bit
lavapipe: add a ref for the tess_ccw nir on creation
lavapipe: unify lvp_pipeline_nir creation
lavapipe: dynamically bind noop fs at draw time when needed
lavapipe: don’t double unbind gfx stages on pipeline bind
lavapipe: split out gfx stage unbinding
lavapipe: only update shader access for bind/unbind stages
lavapipe: only unset tess_states pointers on tes bind
lavapipe: avoid uniformly unsetting gs_output_lines
lavapipe: move default rasterizer state values to rendering_state init
mapi: add InternalInvalidateFramebufferAncillaryMESA
glthread: add _mesa_glthread_invalidate_zsbuf()
kopper: apply ancillary invalidation through glthread on swapbuffers
llvmpipe: fix linear fs analysis with nonzero fs outputs
llvmpipe: fix handling of unused color attachments
zink: add spirv builder function for terminate
zink: set src access when rebinding buffers, unset unordered_*
zink: fix quads emulation gs with array variables
zink: block resolves where src extents > dst extents
zink: omit VkPipelineVertexInputStateCreateInfo with dynamic vinput
zink: flag vertex buffers for rebind after vstate draws
zink: use search_or_add for masking vstate
zink: bind vertex state directly from draw hook
zink: add another vstate draw template for popcnt presence
zink: explicitly pass null velems when creating pipelines with dynamic vinput
zink: don’t swizzle velems state for vstate draws
zink: use fast popcnt for vstate draws
zink: stop caching vertex states
lavapipe: break out pipeline layout creation for reuse
lavapipe: implement EXT_shader_object
lavapipe: advertise EXT_shader_object
zink: delete shader reordering in assign_io()
zink: add and populate a shader_info struct to zink_shader
zink: pass nir_shader to update_so_info()
zink: generate flat_flags during shader creation
zink: use zink_shader::info instead of zink_shader::nir::info
zink: simplify fbfetch output detection from fs
zink: pass nir directly to zink_shader_tcs_create()
zink: swap nir pointers when compiling compute shaders
zink: directly return nir from zink_shader_tcs_create
zink: streamline nir cloning for assign_io
zink: store nir as serialized on zink_shader structs
zink: simplify assign_io() further
zink: break out nir blob deserializing
zink: move nir cloning out to callers of zink_shader_compile
zink: store num_inlinable_uniforms separately for cs programs
zink: always store nir serialized
zink: be explicit about separate shader dsl indexing during creation
zink: rework choose_pdev (again)
glthread: use id 0 for internal buffer objects
radv: fix leak of nir from retained shaders
zink: don’t try copying multiple results for conditional render copy
zink: more explicitly track/check rp optimizing per-context
zink: don’t access non_fs part of zink_shader from fs
zink: reuse d3d12 variable copying to make passthrough gs more robust
zink: reuse copy_vars for generated tcs
zink: don’t trigger shader variants on pcp change if driver supports dynamic pcp
Revert “zink: don’t trigger shader variants on pcp change if driver supports dynamic pcp”
zink: try to prune resources from barrier jit on fb unbind
lavapipe: copy fragment shader when merging GPL pipelines
lavapipe: refactor/consolidate GPL shader copying
lavapipe: don’t double-inline ubo0
lavapipe: implement inline variant caching
zink: block oom flushes during unordered blits
zink: unroll array loop when copying vars for passthrough shaders
zink: free GPL input/output libs on context destroy to avoid leaking
zink: fix GPL lib leaking
zink: remove redundant ‘blitting’ check in zink_prep_fb_attachment()
zink: break out feedback loop pipeline state flagging for reuse
zink: pre-convert attachment id to attachment idx
zink: eliminate implicit feedback loops on rp begin
zink: track per-image swapchain layouts
zink: handle swapchain handoffs around makecurrent
zink: remove a fixed validation error for ci
mesa/st/program: don’t init xfb info if there are no outputs
zink: remove atomics from zink_query
zink: pass ctx through query destroy paths
zink: always defer query pool deletion
zink: move memoryTypeIndex selection down in general bo allocation
zink: slightly rework memoryTypeIndex selection to pre-determine heap
zink: restore BAR allocation failure demotion
zink: make general bo allocation more robust by iterating
zink: avoid zero-sized memcmp for descriptor layouts
iris: use util_framebuffer_get_num_samples when setting ps dispatch samples
zink: manually re-set framebuffer after msrtss replicate blit
zink: handle ‘blitting’ flag better in msrtss replication
zink: skip msrtss replicate if the attachment will be full-cleared
zink: avoid recursion during msrtss blits from flushing clears
nir/lower_alpha_test: rzalloc state slots
zink: fix non-db bindless texture buffers
zink: emit demote cap when using demote
zink: only print copy box warning once per resource
util/debug: move null checks out of debug message macro
zink: don’t bitcast bool deref loads/stores
drisw: don’t leak the winsys
zink: check for extendedDynamicState3DepthClipNegativeOneToOne for ds3 support
draw: fix viewmask iterating
zink: don’t pin flush queue threads if no threads exist
zink: add z32s8 as mandatory GL3.0 profile attachment format
nir/gs: fix array type copying for passthrough gs
zink: fix array copying in pv lowering
gallivm: break out native vector width calc for reuse
llvmpipe: do late init for llvm builder
zink: break out VkImageViewUsageCreateInfo applying for reuse
zink: reapply VkImageViewUsageCreateInfo when rebinding a surface
draw: fix robust ubo size calc
llvmpipe: fix native vector width init
zink: add extendedDynamicState3DepthClipNegativeOneToOne to profile
zink: only unset a generated tcs if the bound tcs is the generated one
zink: set depth dynamic state values unconditionally
zink: null some descriptor buffer pointers during destruction
zink: sync queries at the end of cmdbufs
cso: unbind fb state when unbinding the context
i915: use util_copy_framebuffer_state to set fb state
i915: use util_unreference_framebuffer_state to unref fb state
iris: use util_unreference_framebuffer_state to unref fb state
softpipe: use util_unreference_framebuffer_state to unref fb state
v3d: use util_unreference_framebuffer_state to unref fb state
vc4: use util_unreference_framebuffer_state to unref fb state
llvmpipe: use util_unreference_framebuffer_state to unref fb state
svga: use util_unreference_framebuffer_state to unref fb state
zink: don’t init mutable resource bit for swapchain images
zink: don’t init mutable for swapchain src during blit
zink: allow vk 1.2 timelineSemaphore feature if extension isn’t supported
zink: stringify unsupported prim restart log error
zink: delete persistent map tracking
zink: add PERSISTENT for db buffer maps
zink: delete unnecessary pipeline stage flags from inference
zink: use an intermediate variable for binding ssbo slots
zink: unbind the ssbo slot being iterated, not the index of the buffer
zink: flush INDIRECT_BUFFER mem barrier for compute
zink: disable batched unordered barriers with ZINK_DEBUG=noreorder
zink: block batching of unordered barriers if previous usage was write
zink: fix uncached memory readback
glsl/lower_samplers_as_deref: apply bindings for unused samplers
zink: bind bindless db set when updating separate shader db sets
zink: compare desc set to detect bindless vars in separate shaders
zink: adjust bindless texel buffer handle before indexing
zink: block more flushes during unordered blits
zink: also cache swapchain semaphores
Mohamed Ahmed (3):
vulkan/runtime: move common buffer related entrypoints to vk_buffer.c
vulkan/runtime: implement vkGetBufferMemoryRequirements2()
anv: remove GetBufferMemoryRequirements2()
Nanley Chery (16):
docs: Document the implicit barriers around blits
glsl: Add compute shaders to encode DXT5/BC3
glsl: Modify the #includes in the DXT5 shaders
mesa: Create _mesa_CreateShaderProgramv_impl
mesa/st: Add get_compute_program
mesa/st: Add and use create_bc1_endpoint_ssbo
mesa/st: Add st_compute_transcode_astc_to_dxt5
mesa/st: Add st_texture_image_resource_level
mesa/st: Enable compute-based transcoding to DXT5
mesa/st: Measure compressed fallback unmap paths
iris: Update comment in iris_cache_flush_for_render
iris: Flush caches for aux-mode changes more often
iris: Drop iris_cache_flush_for_render
iris: Allocate ZEROED BOs for shared resources
iris/bufmgr: Add and use zero_bo
iris/bufmgr: Handle flat_ccs for BO_ALLOC_ZEROED
Nataraj Deshpande (1):
anv: Bump VkDeviceMemory objects limit to 4GB
Neha Bhende (1):
docs: Add GL 4.3 support info in mesa docs
Nicolas Dufresne (1):
util/format: Fix wrong colors when importing YUYV and UYVY
Nicolas F (1):
driconf: remove the adaptive sync special case for mpv
Oleksii Bozhenko (5):
glsl: fix gl_CullDistance lowering from float[8] to vec4[2]
ci: Uprev Piglit
Move combining clip and cull optimization before linking
wsi: add rgb_component_bits_are_equal
wsi: remove get_sorted_vk_formats duplication
Patrick Lerda (25):
lima: fix memory leak related to u_transfer_helper_create()
mesa/program: fix memory leak triggered by parser errors
mesa/st: fix possible crash related to arb invalid memory access
r600: fix shader blob memory leak
vbo/save: fix possible crash related to fixup_vertex()
mesa/shaderapi: fix path memory leak
mesa/framebuffer: fix gl_framebuffer.resolve refcnt imbalance
mesa/program: fix memory leak triggered by invalid extended swizzle selector
mesa/program: fix memory leak triggered by multiple targets used on one texture image unit
mesa/program: fix memory leak triggered by arb alias
radeonsi: fix memory leak related to ureg_get_tokens()
glx: fix memory leak related to __glXCloseDisplay()
r600: fix refcnt imbalance related to shader
intel: fix memory leak related to brw_nir_create_passthrough_tcs()
r600: fix typo that could lead to a possible crash
egl: fix memory leak related to _eglRefreshDeviceList()
r600: fix refcnt imbalance related to r600_set_vertex_buffers()
r600: fix refcnt imbalance related to evergreen_set_shader_images()
lima: fix refcnt imbalance related to framebuffer
r600/sfn: fix memory leak related to sh_info->arrays
aux/draw: fix memory leak related to ureg_get_tokens()
crocus: fix refcnt imbalance related to framebuffer
crocus: fix refcnt imbalance related to crocus_create_surface()
r600: fix refcnt imbalance related to atomic_buffer_state
radeonsi: set proper drm_amdgpu_cs_chunk_fence alignment
Paul Gofman (1):
driconf: add a workaround for Kaiju-A-Gogo
Paulo Zanoni (8):
anv: don’t leave undefined values in exec->syncobj_values
anv: check the return value of anv_execbuf_add_bo_bitset()
anv: run buf_finish() if add_bo() fails during execute_simple_batch()
anv: rename anv_execbuf->array_length to bo_array_length
anv: use vk_realloc for the anv_execbuf arrays
hasvk: don’t leave undefined values in exec->syncobj_values
hasvk: check the return value of anv_execbuf_add_bo_bitset()
anv: there’s no need to set exec_obj offsets twice
Pavel Ondračka (16):
nir/lower_bool: ntt: Generate a good opcode for bcsel
r300: update rv515 ci failures list
r300: skip sin/cos input range transformation for nine and ntt
r300: remove backend input range transformation for sin and cos
ntt: pass ubo_vec4_max nir_opt_offsets flag through ntt options
r300: set ubo_vec4_max ntt option properly
r300: remove backend negative addressing emulation
nir: nir opt_shrink_vectors whitespace fix
nir: mark progress when removing trailing unused alu channels
nir: mark progress when removing trailing unused load_const channels
r300: set register file to none if swizzles are constant only
nir: shrink phi nodes in nir_opt_shrink_vectors
r300: drop VDPAU support
r300: simplify KILL transformation
nine: use separate register for aL emulation
r300: fix unconditional KIL on R300/R400
Pedro J. Estébanez (4):
spirv_to_dxil: Unify spirv_to_nir_options
spirv2dxil: Split read-only image as SRV logic into declared and inferred
spirv: Assume input attachments are read-only
Revert “microsoft/compiler: Use SRVs for read-only images”
Philip Langdale (1):
radeonsi: correctly declare YUV420_10 RT Format support for AV1
Philipp Zabel (2):
vulkan/wsi/wayland: fix acquire_next_image to report timeouts properly
zink: fix build with -Dvulkan-beta=true
Pierre-Eric Pelloux-Prayer (37):
radeonsi: simplify dpbb settings
ac/info: move pci bus info in a struct
ac: add ac_query_pci_bus_info helper
ac: don’t call ac_query_pci_bus_info from ac_query_gpu_info
radeonsi/sqtt: don’t read results for disabled SEs
radeonsi/sqtt: disable SE1+ on GFX11
radeonsi/sqtt: update registers for gfx11
radeonsi/sqtt: implement offset workaround for gfx11
vbo: remove bogus assert
vbo: lower VBO_SAVE_BUFFER_SIZE to avoid large VRAM usage
glthread: fix glArrayElement handling
drm-uapi/dma-buf.h: use __u32/__u64 types
winsys/amdgpu: use DMA_BUF_SET_NAME_B if available
radeonsi/gfx11: clamp PRIM_GRP_SIZE
radeonsi/gfx11: fix ge_cntl programming
amd/surface: fix base_mip_width of subsampled formats
winsys/amdgpu: use amdgpu_device_get_fd
radeonsi/video: use specific PIPE_BIND_ value for video buffers
radeonsi: fix incorrect vgpr indices in the ps_prolog
radeonsi/test: use gbm-skips.txt
radeonsi/test: update test results
radeonsi: don’t use PKT3_SET_SH_REG_INDEX on gfx9 and older
radeonsi: fix fast depth_clear_value/stencil_clear_value
egl/wayland: fix glthread deadlocks
Revert “driconf: add a workaround for plasmashell freezing”
ac/llvm: fix build with LLVM 17
mesa: fix CopyImageSubDataOES with GL_TEXTURE_EXTERNAL_OES
amd/surface: rename metadata functions
ac/surface: introduce umd metadata v2
radeonsi: add AMD_DEBUG=extra_md
radeonsi: don’t use si_decompress_dcc if the blitter is running
radv: add RADV_DEBUG=extra_md
radeonsi: don’t use alignment_log2 of imported buffers
mesa: fix invalid index_bo refcounting
util/vbuf: clarify indirect draws handling
util/vbuf: fix index_bo leak
radeonsi: update test results
Pino Toscano (1):
symbols-check: support OSes based on GNU toolchain
Qiang Yu (78):
radeonsi: implement nir_load_ring_gsvs_amd
radeonsi: implement nir_load_ring_gs2vs_offset_amd
radeonsi: lower nir streamout intrinsics in abi
radeonsi: use nir_print_xfb_info to replace si_dump_streamout
radeonsi: use ac_nir_lower_legacy_vs to replace si_llvm_vs_build_end
radeonsi: add nir implementation of gs copy shader generation
radeonsi: build legacy gs output info when shader compile
radeonsi: replace llvm gs copy shader generation with nir
radeonsi: remove llvm gs copy shader generate
radeonsi: replace llvm legacy gs code with nir lowering
radeonsi: move gfx10_ngg_export_vertex to si_shader_llvm.c
gallium/aux: remove nir_helpers
nir/xfb_info: nir_gather_xfb_info_from_intrinsics update nir xfb_info
radeonsi: update nir xfb info after medium io lowering
nir: add nir_export_amd intrinsic
ac/llvm: implement nir_export_amd
aco: implement nir_export_amd
ac/nir: gs and nogs use ac_nir_export_primitive
ac/nir: add ac_nir_export_position
ac/nir: add ac_nir_export_parameter
ac/nir: add force_vrs to ac_nir_export_position
amd,radeonsi: implement nir_load_force_vrs_rates_amd in driver abi
radeonsi: clamp vertex color in legacy gs instead of gs copy shader
radeonsi: update outputs written nir info
radeonsi: remove the extra handling for VS/TES primitive id
radeonsi: set nr_pos_exports outside of llvm translation
ac/nir,radv,radeonsi: legacy vs use ac_nir_export_(position|parameter)
ac/nir,radv,radeonsi: gs copy shader use ac_nir_export_(position|parameter)
ac/nir/ngg: fix clip dist culling mask uninitialized
ac/nir/ngg: change clipdist_neg_mask_var type to uint32
ac/nir/ngg,radv,radeonsi: nogs use ac_nir_export_(position|parameter)
ac/nir/ngg: prepare gather_vs_outputs to be used by gs
ac/nir/ngg: gs use ac_nir_export_(position|parameter)
ac/nir/ngg,radv: ms use ac_nir_export_(primitive|position|parameter)
nir,ac/llvm,aco: remove nir_export_primitive_amd
nir,ac/llvm,aco,radv,radeonsi: remove nir_export_vertex_amd
aco: remove early_rast wait insert
radv: move radv_consider_force_vrs above radv_fill_shader_info
radv: use amd common force_vrs option
ac/llvm,radeonsi: lower nir_load_barycentric_at_sample in abi
radeonsi: add num_component param to load_internal_binding
ac/llvm,radeonsi: lower fbfetch in abi
radeonsi: only init llvm output when needed.
ac/llvm: only init outputs when fragment shader for radv
aco: only ls and ps use store output now
aco, radv: Add load_grid_size_from_user_sgpr to aco options.
aco, radv: Move is_trap_handler_shader to aco info.
ac/nir: move store_var_components to common place
ac/nir: tcs write tess factor support pass by reg
ac/nir: init tess factor location with IO remap
ac/nir: handle tess factor output missing case
ac/llvm,radeonsi: lower nir_load_ring_tess_factors_amd
radeonsi: lower nir_load_ring_tess_factors_offset_amd
radeonsi: monolithic TCS emit tessfactor in nir directly
ac/llvm: respect channel_type when ac_build_buffer_load
ac/llvm: add missing type convert for nir_load_buffer_amd
nir: pack_(s|u)norm_2x16 support float16 as input
ac/llvm: implement float16 nir_op_pack_(s|u)norm_2x16
aco: implement float16 nir_op_pack_(s|u)norm_2x16
nir,radeonsi: add and implement nir_load_alpha_reference_amd
nir: add nir_fisnan helper function
ac/nir: add ac_nir_lower_ps
radeonsi: monolithic PS emit epilog in nir directly
radeonsi: expose si_nir_load_internal_binding
ac/nir: add ac_nir_load_arg_at_offset
radeonsi: add si_nir_lower_vs_inputs
ac/llvm: vs_rel_patch_id can also be fixed up
ac/llvm: move ac_fixup_ls_hs_input_vgprs to amd common
radeonsi: monolithic VS emit prolog in nir directly
ac/llvm,radeonsi: remove abi->load_inputs implementation
ac/llvm: remove ac_build_opencoded_load_format
radeonsi: fix max scratch lds size calculation when ngg
ac/nir/ngg: fix gs culling vertex liveness check for odd vertices
ac/nir/ngg: fix store shared alignment
ac/llvm: remove some unused code replaced by nir
ac,aco: move gfx10 ngg prim count zero workaround to nir
aco: fix nir_f2u64 translation
ac/nir/cull: fix line position w culling
Raun (2):
dzn: Enable VK_KHR_bind_memory2
dzn: Enable VK_KHR_get_memory_requirements2
Rhys Perry (48):
radv: implement GS load_ring_gsvs_amd/load_ring_gs2vs_offset_amd
radv,aco: use ac_nir_lower_legacy_gs
aco: restore semantic_can_reorder for GS output stores
ac/nir: use store_buffer_amd’s base index
ac/llvm: add support for fp32 addition atomics
aco: add support for fp32 addition atomics
radv: load ssbo_atomic_fadd descriptor
radv/gfx11: expose shaderBufferFloat32AtomicAdd
aco/tests: fix assembler.gfx11.vop12c_v128 with LLVM 15
aco/tests: update assembler tests for latest LLVM 16
radv: skip creation of null TLAS for null winsys
aco: set has_color_exports with GPL
aco: end reduce tmp after control flow, when used within control flow
aco/tests: add setup_reduce_temp.divergent_if_phi
aco/spill: always end spill vgpr after control flow
aco: limit VALUPartialForwardingHazard search
radv: set state.vbo_misaligned_mask_invalid in radv_bind_vs_input_state
ac: move ring_offsets to ac_shader_args
ac/llvm: let ring_offsets be accessed like a normal arg
radv/llvm: use the ring_offsets shader arg
aco: fix out-of-bounds access when moving s_mem(real)time across SMEM
aco: don’t modify exec in p_interp_gfx11
aco: don’t apply modifiers through DPP to unsupported instructions
aco: fix pathological case in LdsDirectVALUHazard
aco: always update orig_names in get_reg_phi()
radv: remove is_internal pipeline creation parameter
aco/tests: add tests for v_fma_f32 with 2 fp16 literals
aco: make IDSet sparse
nir/range_analysis: fix vectorized phis and intrinsics
nir: use xyzw order for precise fdot
nir: make fdph lowering match fdot
nir: add nir_lower_alu_width_test.fdot_order
aco/gfx11: fix RT prolog scratch initialization
aco: set needs_flat_scr=true for RT
util/dynarray: allow an initial stack allocation to be used
nir/range_analysis: add missing masking of shift amounts
nir/range_analysis: add helpers for limiting stack usage
nir/range_analysis: use perform_analysis() in nir_unsigned_upper_bound()
nir/range_analysis: use perform_analysis() in nir_analyze_range()
radv: fix setting radv_shader_info::user_data_0 with rt
aco: don’t optimize s_or_b64(v_cmp_u_f32(a, b), cmp(a, a))
aco: fix nir_var_shader_out barriers for task shaders
radv/gfx11: improve RT scratch allocation
nir: make nir_fisnan helper exact
aco: remove SMEM_instruction::prevent_overflow
ac/nir/ps: fix null export write mask miss set to 0xf
aco: don’t move exec reads around exec writes
aco: don’t move exec writes around exec writes
Rob Clark (180):
freedreno/ci: Switch a630 jobs over to manual
freedreno/ci: Cleanup a618 yaml
freedreno/ci: Add a618 egl/skqp/piglit jobs
Revert “freedreno/ci: Switch also performance a630 job to manual”
Revert “freedreno/ci: Switch a630 jobs over to manual”
freedreno/ci: Add an a618 flake
freedreno/drm: Remove assert
freedreno: Fix tracking of enabled SSBOs
freedreno/a6xx: Workaround for no pos/psize
freedreno: Don’t re-install a flushed batch
freedreno/a6xx: Rework barrier handling
freedreno/ir3: Stop copying options
freedreno/ir3: Let driver specify fb-read descriptor
freedreno: Track image/SSBO usage for all stages
freedreno/ir3: Add descriptor set lowering
freedreno/a6xx: Pre-bake IBO descriptor sets
freedreno/a6xx: Add bindless state
freedreno/a6xx: Switch over to bindless IBO
freedreno/a6xx: Remove bindfull IBO state
freedreno/a6xx: Removing munging of tex state for IBO
freedreno/a6xx: Remove tex fb_read state
freedreno/a6xx: Move compute to tex state group
freedreno/a6xx: Move tex state building
freedreno/a6xx: Expose SSBO/image for all shader stages
freedreno: Restore GL_VENDOR string
gallium/util: Add util_writes_depth() helper
freedreno/a6xx: Add LRZ perf warn for ztest direction changes
freedreno/a6xx: Invalidate LRZ on blend+depthwrite
turnip: Rename lrz force_disable_mask
turnip: Invalidate LRZ on blend+depthwrite
util/xmlconfig: Use os_get_option()
freedreno: Add driconf to disable conservative LRZ
freedreno/a6xx: Add a few kernel regs/etc
freedreno/drm: Add some ref/unref debugging
freedreno/drm: Detect zombie BOs
freedreno/drm: Remove bo_del_or_recycle()
freedreno/drm: Split out bo->finalize()
freedreno/drm: Synchronize handle close and lookup
freedreno/drm/virtio: Flush before CREATE_BLOB
freedreno/drm: Restart import on zombie race
freedreno/gmem: Fix for partial z/s fast-clear
freedreno/decode: Increase size of offsets table
freedreno/a6xx: LRZ for MSAA
freedreno/ir3: Scalarize load_ssbo
freedreno/a6xx: Add missing CS_BINDLESS mapping
freedreno/a6xx: Add CS instrlen workaround
freedreno: nondraw-batch
freedreno: Skip flush_resource with explicit sync
freedreno/a6xx: Don’t double-write SP_CS_OBJ_START
freedreno: Don’t open-code setting dirty CS state
freedreno/a6xx: Make shader state independent of grid info
freedreno/a6xx: Also FLUSH_CACHE on image barrier
freedreno/a6xx: Remove excess CS flushing
freedreno+ir3: Move storage_16bit to compiler options
freedreno/a6xx: Move CS state to PROG state group
freedreno/drm: Move sa_cpu_prep() to core
freedreno/drm/virtio: Limit guest handles passed to virtgpu
ir3: Quiet unused variable warning
freedreno: Quiet unused variable warnings
freedreno/a2xx: Move pack_rgba()
freedreno: Indent fixes
freedreno/a6xx: Move num_driver_params to program state
freedreno: Move num_vertices calc to backend
freedreno: Remove impossible NULL check
freedreno: Add FD_DIRTY_QUERY
freedreno: Avoid screen lock when no rsc tracking needed
freedreno: Account for multi-draw in num_draws
freedreno: Push num_draws down to backend
freedreno/a6xx: Drop unused return
freedreno/a6xx: Split out flush_streamout() helper
freedreno/a6xx: Multi-draw support
freedreno/a6xx: Do tex-state invalidates in same ctx
freedreno/drm: Make rb refcnt non-atomic
freedreno/a6xx: Remove tex-state refcnting
freedreno: Move blend out of dirty-rsc tracking
freedreno: Move FD_MESA_DEBUG cases out of draw_vbo
freedreno/a6xx: Pre-compute PROG related LRZ state
freedreno: Avoid taking screen lock
freedreno/batch: Stop tracking cross-context deps
freedreno: Drop batch lock
freedreno: Add seqno helper
freedreno/drm: Optimize stateobj re-emit
freedreno/a6xx: Move rsc seqno out of tex cache key
freedreno/a6xx: Fix set_sampler_views(start != 0)
freedreno/a6xx: Drop unneeded fd6_texture_state() arg
freedreno/a6xx: Fix sampler view rsc_seqno for X32_S8X24
freedreno/a6xx: Add a way to assert valid format
freedreno/a6xx: Remove needs_invalidate flag
freedreno/a6xx: Small cleanup
freedreno/a6xx: Static-ify sampler_view_update()
freedreno/a6xx: Fix view_seqno in tex cache key
freedreno/cffdec: Fix uninitialized count for pkt2
freedreno/cffdec: Add helper to find next pkt
freedreno/cffdec: Add helper to parse CP_INDIRECT_BUFFER
freedreno/cffdec: Fix hang location detection
freedreno/crashdec: Refactor crashdec tests
freedreno/crashdec: Add another prefetch test
freedreno/crashdec: Handle multi-IB prefetching
freedreno/crashdec: Disable GALLIUM_DUMP_CPU
vk/runtime: Allow enumerate and try_create_for_drm to coexist
turnip: Move things to prep for multi-kernel support
turnip: drm code-motion
turnip: Split out vfuncs for kernel interface
turnip: Allow knl backend specific entrypoints
turnip: Move QueueWaitIdle entrypoint to kgsl
turnip: Handle kgsl vs drm specifics at runtime
turnip: Refactor device loading
tu+meson: Re-work KMD selection
tu/kgsl: Propagate tu_physical_device_init() errors
Revert “freedreno: Account for multi-draw in num_draws”
freedreno/a6xx: Namespace reg/pkt packer vars
freedreno/a6xx: Convert blitter to OUT_REG()
freedreno/a6xx: Fix mirror x/y blits
util: Add a simple no-op libdrm shim
turnip: Use libdrm shim
loader: Use libdrm shim
vk/runtime: Use libdrm shim
freedreno/common: Replace or_mask() with BitsetEnum<T>
freedreno: Promote non-drawing batches to sysmem
freedreno: Nerf strict-aliasing warning for all of gcc
freedreno/registers: Schema validation for gen_header.py
freedreno/registers: Add regs for a690
freedreno: Quiet c++ warning about designated initializers
freedreno/ir3: Un-inline enums
freedreno/ir3: Don’t use negative opc for meta instructions
freedreno/ir3: c++-proof the headers
freedreno/ir3+tu: Calculate subgroup size in ir3
freedreno/ir3: Add missing driver params
freedreno: Un-inline buffer-mask enum
freedreno: c++-proofing
freedreno/a6xx: Rework texture_clear fallback
freedreno/a6xx: Add missing “inline”
freedreno/a6xx: Fix designator initializer order
freedreno/a6xx: Convert to c++
freedreno/registers: Fix designator order
freedreno/registers: Add prefix=”variant”
freedreno/registers: Merge a6xx and a7xx regs
freedreno/registers: Start adding a7xx pipe/control regs
freedreno/decode: Start adding a7xx support
freedreno/registers: Start adding stuff for a7xx
freedreno/registers: Track varset
freedreno/registers: Split out regpair builder helper
freedreno/registers: Add c++ magic for register variants
freedreno/registers: Fix nameless fields
freedreno/registers: Define rest of CP_REG_WRITE
freedreno/a6xx: Simplify iova emit
mesa: Rework discard_framebuffer()
driconf: Add ignore_discard_framebuffer option
driconf: Work around incorrect GI discard/invalidate
freedreno: Specify GMEM tile alignment per GPU
freedreno+tu: Big GMEM support
freedreno+tu: Add a690 support
freedreno/a6xx: Restore mode
freedreno/rnn: Fix reg names for regs with variants
freedreno/afuc: Add raw mode for disasm
freedreno/registers: Add control reg for zap fw base
Revert “CI: Disable freedreno”
dri2/android: Bypass throttling
freedreno/drm: Fast path for idle check
freedreno/drm: Stop cleanup at first active BO
mesa: Add a few more function traces
freedreno/drm: Make threaded-submit optional
freedreno/drm: Disable threaded-submit for msm
freedreno: Optimize repeated finishes
freedreno: Stop being too clever by half
freedreno: Hoist dirty vars
freedreno: Extract out a helper
freedreno: Inline single-use helpers
freedreno: Re-work dirty-resource tracking
freedreno: Avoid looping shader stages if nothing dirty
freedreno: Move driconf settings into sub-struct
freedreno: Support the disable_throttling=true driconf option
util/disk_cache: Split out queue initialization
util/disk_cache: Add NONE type
util/disk_cache: Use queue state to skip put
util/disk_cache: Move blob_put_cb to the async queue
freedreno/a6xx: Allow z24s8 format casts
freedreno/a6xx: Fix valid_format_cast logic for newer a6xx
freedreno: Fix resource tracking vs rebind/invalidate
dri/android: Fix MSAA resolve
Rohan Garg (9):
iris: Don’t flush the render cache for a compute batch
anv: drop unused headers
anv: reuse the VK_IMAGE_ASPECT_PLANES_BITS_ANV macro
isl: fix some documentation
anv/blorp: use existing function to convert the op to a string
anv: break out of the loop when the first color attachment is found
anv,hasvk: cleanup unused enum
intel/genxml: Add the preferred slm size enum for gen125
anv,blorp,iris: Set PreferredSLMAllocationSize on gfx125+
Roland Scheidegger (2):
llvmpipe: only use accurate_a0 hack if there are no textures bound
lavapipe, nir: Fix wrong array index scaling in nir_collect_src_uniforms
Rose Hudson (4):
radeonsi: report 0 block size for Polaris HEVC encoding
asahi: wire up shader disk cache support
agx: isolate compiler debug flags
asahi: disable disk cache in debug runs
Ruijing Dong (5):
frontends/va: revert commit 0b02db30
radeonsi/vcn: fix a h264 decoding issue
frontends/va: disable skip_frame_enable in vaapi interface.
radeonsi/vcn: correct cropping for hevc case
radeonsi/vcn: fix decoding bs buffer alignment issue.
Ryan Neph (17):
ci: fix directory existence racing in parallel test execution
util/u_process: add MESA_PROCESS_NAME override to util_get_process_name()
util/u_process: remove util_get_process_name_may_override()
util/xmlconfig: add MESA_DRICONF_EXECUTABLE_OVERRIDE
venus: update venus-protocol headers to partially fix WA1
venus: temporarily redirect VkDrmFormatModifierPropertiesListEXT to “2” variant
ci: uprev virglrenderer
venus: update venus-protocol headers to fix WA1
Revert “venus: temporarily redirect VkDrmFormatModifierPropertiesListEXT to “2” variant”
venus: add vn_relax_init/_fini()
venus: set/check ring status bits independently
venus: init exp features before ring init again
venus: update to latest protocol for ringMonitoring
venus: check and configure new ringMonitoring feature
venus: re-use VN_DEBUG_NO_ABORT to disable ring monitoring abort()
virgl: hook new get_fd proc for drm winsys
i915: hook new get_fd proc for drm winsys
Sagar Ghuge (10):
iris: Stop marking context unconditionally as guilty
intel/fs: Always stall between the fences on Gen11+
nir: Handle other variants of image_samples properly while lowering
intel/compiler: Add swsb_stall debug option
anv: Implement Wa_14015297576
iris: Implement Wa_14015297576
intel/compiler: Add Wa_14014063774 for slm_fence
intel/decoder: Bump the binding table guess value to 32
anv: Drop unused param from add_surface_reloc
anv: Drop dead code that sets the L3BypassDisable field
Sai Teja Pottumuttu (2):
iris: Fix to release BO immediately if not busy
anv: Fix stride mismatch in mesa and minigbm
Sajeesh Sidharthan (3):
radeonsi/vcn: disable fence for JPEG decoding
radeonsi/vcn: set bitstream buffer size to encoded bitstream size
radeonsi/vcn: optimize bitstream buffer resize logic
Sam Edwards (1):
nouveau: Fix null dereference in nouveau_pushbuf_destroy
Samuel Iglesias Gonsálvez (1):
docs/developers: Add Igalia as Mesa consultancy
Samuel Pitoiset (279):
radv: fix missing implementation of creating images from swapchains
radv: fix hashing pipeline keys if RADV_PERFTEST=ngg_streamout is used
radv: fix re-emitting RB+ when the non-compacted color format changes
ac/nir: clear unused components before storing XFB outputs to LDS
ac: add TC_OP_ATOMIC_SUB_32
radv: fix setting MAX_MIP for BC views
radv: fix buffer to image copies with BC views on the graphics queue
radv: fix creating BC image views when the base layer is > 0
radv: rename ac_surf_nbc_view::max_mip to num_levels
radv: move some color blend helpers to radv_private.h
radv: add a new helper for normalizing blend factors
radv: add support for dynamic blend equation
radv: enable compiling PS epilogs on-demand for dynamic color blend equations
radv: fix detecting that blend is enabled when all CB states are dynamic
radv: advertise extendedDynamicState3ColorBlendEquation
radv: remove an old FIXME about a possible bug with TC-compat HTILE
radv/winsys: fix incorrect PCIID for GFX11 in the null winsys
radv: print depth image size with RADV_DEBUG=img
radv: fix RADV_DEBUG=hang with multiple cmdbuffer per submission
radv/winsys: prefix all error messages with RADV
radv: fix creating libraries with PS epilog and all CB states as dynamic
radv: fix ignoring graphics shader stages that don’t need to be imported
radv: add a layer for fixing rendering issues with RAGE2
radv: simplify VK_PIPELINE_CREATE_FAIL_ON_PIPELINE_COMPILE_REQUIRED
radv: pass the number of stages to radv_hash_shaders()
radv: split radv_create_shaders() between graphics and compute shaders
radv: rename radv_create_shaders() to radv_graphics_pipeline_compile()
radv: ignore all CB dynamic states when there is no color attachments
radv: regroup dynamic states initialization
radv: only initialize non-zero values for the default dynamic state
radv: stop setting INTERPOLATE_COMP_Z
radv: fix RB+ for SRGB formats
radv: adjust ACCUM tessellation fields on GFX11+
radv: fix GPL fast-linking with libs that have retained NIR shaders
radv: skip shaders cache for fast-linked pipelines with GPL
radv: remove useless check about CS in radv_lower_io()
radv: simplify pipeline_has_ngg during graphics shaders compilation
radv: add helpers for capturing shaders and statistics
radv: pass radv_graphics_pipeline to radv_graphics_pipeline_compile()
radv: move retained shaders info to radv_graphics_pipeline
radv: pass radv_compute_pipeline to radv_compute_pipeline_compile()
radv: pass pCreateInfo to radv_graphics_pipeline_compile()
radv: optimize radv_pipeline_layout_add_set() slightly
radv: remove redundant zero initialization of pipeline layout
radv: remove radv_pipeline_stage::spirv::sha1
radv: allow to create a noop FS in a library with GPL
radv: remove one unused variable in radv_graphics_lib_pipeline_init()
radv: pass the lib flags for generating the pipeline key
radv: return a boolean value in radv_pipeline_needs_dynamic_ps_epilog()
radv: stop using the graphics pipeline key after compilation
radv: determine the last VGT API stage earlier
radv: skip compilation when possible with GPL fast-linking
radv: simplify an assertion after considering RADV_FORCE_VRS
radv: do not insert fast-linked libraries to the shaders cache
radv: fix skipping graphics pipeline compilation when the FS is NULL
radv: cleanup graphics pipeline library flags uses
radv: simplify determining when the fragment shader needs an epilog
radv: regroup PS epilog info when generating the graphics pipeline key
radv: fix disabling MRT compaction for on-demand PS epilogs
radv: make sure to disable MRT compaction when compiling a PS epilog with GPL
radv: simplify creating a FS epilog from a library
radv: stop skipping the cache for compute/raytracing pipelines with GPL
radv: stop skipping the cache for monolithic graphics pipelines with GPL
docs: add missing RADV_PERFTEST=video_decode
docs: stop reporting RADV_PERFTEST=gpl as experimental/suboptimal
radv/ci: set RADV_PERFTEST=GPL for all VKCTS jobs
radv/ci: bump the number of runners to 3 for vkcts-navi21-valve
radv: restore uploading shaders individually instead of consecutively
radv: implement graphics shaders relocation for a RGP workaround
radv: fix importing retained NIR shaders when a lib uses the RETAIN bit
radv: use last_vgt_api_stage for determining the last stage with XFB
radv: only initialize shader arguments for the active stages
radv: simplify compiling graphics shaders with a mask of active NIR stages
radv: disable DCC for mipmaps on GFX11
radv: ignore registering pipeline libraries with SQTT
radv/ci: add missing expected failures with RADV_PERFTEST=gpl on GFX1100
radv: reduce maximum line width to 8.0
radv: add support for rectangularLines
Revert “radv: acquire pstate on-demand when capturing with RGP”
radv/amdgpu: only set a new pstate if the current one is different
radv: only skip emitting the pipeline blend state if the FS uses an epilog
radv: stop using a PS epilog when the FS doesn’t write any color outputs
ci: uprev vkd3d-proton
zink/ci: skip KHR-GL46.texture_swizzle.functional with RADV
zink/ci: set RADV_PERFTEST=gpl for RADV jobs
radv/ci: disable vkcts-kabini-valve
radv/ci: move CI lists for external GPUs in separate folder
radv: configure SQ_THREAD_TRACE_CTRL.REG_AT_HWM on GFX11
radv: only enable SQTT for SE0 on GFX11
radv: make sure to wait for the trace buffer also on GFX11
radv: implement a workaround for SQTT on GFX11
radv: disable SPM counters with RGP on GFX11
radv: enable SQTT tracing on GFX11
radv: set VS_OUT_MISC_SIDE_BUS_ENA for clip distances on GFX10.3+
radv/ci: cleanup CI lists for dEQP-VK.memory.* tests that timeout
ac/nir: add resinfo lowering for sliced storage 3D views
radv: implement VK_EXT_image_sliced_view_of_3d on GFX10+
radv: advertise VK_EXT_image_sliced_view_of_3d on GFX10+
radv: cleanup radv_emit_{conservative,msaa}_state() functions
radv: stop setting ENABLE_POSTZ_OVERRASTERIZATION to 1
radv: set MSAA_NUM_SAMPLES to 0 for underestimate rasterization
radv: enable primitiveUnderestimation on GFX9+
zink/ci: skip one more test that timeout with RADV
radv: fix flushing non-coherent images inside secondaries on GFX9+
radv: fix flushing non-coherent images in EndCommandBuffer()
radv: fix draw calls with 0-sized index buffers and robustness on NAVI10
radv: only expose EXT_pipeline_library_group_handles if RT is enabled
amd,ac/rgp: fix SQTT memory types
radv: ignore alpha_is_on_msb on GFX11 because the hw ignores it
radv: use new EVENT_WRITE_ZPASS packet3 on GFX11
radv: fix DCC decompress on GFX11
radv: stop allocating the attr ring BO for compute queues on GFX11
ci: uprev CTS to 1.3.5.0
radv/ci: adjust timeouts for Vega10 and Renoir
radv/ci: stop skipping some graphics pipeline library tests
radv/ci: update CI lists for CTS 1.3.5.0 on GFX110/POLARIS10/PITCAIRN
wsi: move an assertion in wsi_xxx_surface_get_capabilities2()
radv: do not add descriptor BOs on update when the global BO list is used
radv: fix incorrect stride for primitives generated query with GDS
radv: fix border color swizzle for stencil-only format on GFX9+
radv: fix defining RADV_USE_WSI_PLATFORM
radv: move disabling DCC for VRS rate images in radv_get_surface_flags()
ac/surface: add RADEON_SURF_VRS_RATE for selecting swizzle mode on GFX11
radv: add support for VRS attachment on GFX11
radv: do not emit PA_SC_VRS_OVERRIDE_CNTL from the pipeline on GFX11
radv: advertise attachmentFragmentShadingRate on GFX11
radv: enable VK_KHR_fragment_shading_rate on GFX11
radv: disable DCC with signedness reinterpretation on GFX11
radv: move instance related code to radv_instance.c
radv: move physical device related code to radv_physical_device.c
radv: move queue related code to radv_queue.c
radv: move sampler related code to radv_sampler.c
radv: move event related code to radv_event.c
radv: move buffer related code to radv_buffer.c
radv: move device memory related code to radv_device_memory.c
radv: zero-initialize radv_shader_args right before declaring them
radv: zero-initialize radv_shader_info earlier for graphics pipeline
radv: fix the error code when the driver fails to create a PS epilog
radv: determine if a graphics pipeline needs a noop FS earlier
radv: keep track of the retained NIR shaders sha1 for LTO pipelines
radv: allow to cache optimized (LTO) pipelines with GPL
radv: rename RADV_PIPELINE_LIBRARY to RADV_PIPELINE_RAY_TRACING_LIB
radv: add helpers for destroying various pipeline types
radv: fix NGG streamout with VS and GPL on GFX11
spirv: add SpvCapabilityFragmentFullyCoveredEXT
spirv,nir: add support for SpvBuiltInFullyCoveredEXT
radv: lower nir_intrinsic_load_fully_covered
radv: enable SAMPLE_COVERAGE_ENA if the fully covered built-in is used
radv: implement fullyCoveredFragmentShaderInputVariable
radv: enable fullyCoveredFragmentShaderInputVariable on GFX9+
radv: remove set but never used num_preserved_sgprs
radv: stop storing the binary as part of radv_shader_part
radv: store spi_shader_col_format to radv_shader_part_binary
radv: store the total radv_shader_part_binary size
radv: upload prologs/epilogs as part of radv_shader_part_create()
radv: allow to return the PS epilog binary to the pipeline
radv: make radv_shader_part_create() non-static
radv: add support for caching PS epilogs
radv: stop using radv_get_shader_shader() for task shaders
radv: replace radv_lookup_user_sgpr() by radv_get_user_sgpr()
radv: pass shader/base_reg to radv_emit_descriptor_pointers()
radv: pass shader/base_reg to radv_emit_inline_push_consts()
radv: pass shader/base_reg to radv_emit_userdata_address()
radv: use a separate compute path in radv_flush_constants()
radv: pass radv_shader to radv_dump_shader_stats()
radv: rework dumping shaders when a GPU hang is reported
radv: pass radv_ray_tracing_pipeline to radv_rt_pipeline_compile()
radv: pass radv_shader to radv_shader_need_indirect_descriptor_sets()
radv: implement VK_KHR_map_memory2
radv: use common GetBufferMemoryRequirements2()
radv: move cs_regalloc_hang_bug to radv_shader_info
radv: pass a radv_shader to radv_emit_dispatch_packets()
radv: remove radv_pipeline::device completely
radv: rework binding shaders to cmdbuf by introducing new helpers
radv: move radv_meta_* to a new folder
radv: copy the multisample state to radv_cmd_state
radv: move uses_user_sample_locations to radv_multisample_state
radv: separate the sample shading state between FS and graphics pipeline
radv: add DI_PT_RECTLIST to si_conv_prim_to_gs_out()
radv: stop checking dynamic states when emitting the guardband state
radv: rename gfx9_gs_info to radv_legacy_gs_info
radv: move {esgs,gsvs}_ring_size to radv_legacy_gs_info
radv/rt: bind the pipeline stack when it’s not dynamic
radv/ci: update CI lists for Polaris10 and Pitcairn
radv: stop using get_vs_output_info() when emitting VS/NGG shaders
radv: emit the GS copy shader outside of radv_pipeline_emit_hw_gs()
radv: add radv_get_last_vgt_shader() helper
radv: stop using the pipeline for emitting PS inputs
radv: use the shader info stage to simplify emitting NGG shaders
radv: use the ES type to apply a workaround for NGG on GFX10
radv: pass the ES shader to radv_pipeline_emit_hw_ngg()
radv: stop using the pipeline for emitting shaders
radv: pass shader/base_reg to radv_emit_view_index_per_stage
radv: pass a shaders array to radv_get_shader()
radv: add radv_bind_shader() helper
radv: add an assertion about shader stage to radv_bind_pre_rast_shader()
radv: keep track of active stages as part of the cmdbuf state
radv: determine the last VGT shader at pipeline bind time
radv: stop using last_vgt_api_stage_{locs} during cmdbuf recording
radv: move dirtying flags for mesh shading to radv_bind_pre_rast_shader()
radv: copy bound shaders to the cmdbuf state
radv: determine and store the next graphics stage to radv_shader_info
radv: move user_data_0 to the shader info pass
radv: replace pipeline->is_ngg occurrences during cmdbuf recording
radv: replace pipeline->force_vrs_per_vertex during cmdbuf recording
radv: use serialized NIR for graphics libs with the RETAIN flag
radv: remove radv_graphics_pipeline::use_per_attribute_vb_descs
radv: remove radv_graphics_pipeline::last_vertex_attrib_bit
radv: remove radv_graphics_pipeline::next_vertex_stage
radv: remove radv_graphics_pipeline::can_use_simple_input
aco: remove unused aco_shader_info::vb_desc_usage_mask
radv: adjust vb_desc_usage_mask for dynamic VS inputs in the info pass
radv: remove radv_graphics_pipeline::vb_desc_usage_mask
radv: remove radv_graphics_pipeline::vb_desc_alloc_size
radv: rework emitting inner coverage when a fragment shader is bound
radv: copy custom blend mode to the cmdbuf state
radv: add a helper that returns the current rasterized primitive
radv: copy rast_prim to the cmdbuf state
radv: copy uses_{drawid,baseinstance} to the cmdbuf state
radv: copy ia_multi_vgt_param to the cmdbuf state
radv: add a helper to convert a VkPipelineBindPoint
radv: copy need_indirect_descriptor_sets to radv_cmd_state
radv: add push constant state to the cmdbuf state
radv: fix sample shading when a new fragment shader is bound
vulkan: add dynamic support for rectangles enable/mode
radv: add dynamic support for rectangles enable/mode
vulkan: Update XML and headers to 1.3.246
radv: copy db_render_control to the cmdbuf state
radv: set PS_ITER_SAMPLE(1) for sample shading during cmdbuf recording
radv: configure PA_SC_MODE_CNTL_1 during cmdbuf recording
radv: add the raygen shader BO to the cmdbuf list
radv: fix binding raytracing/compute pipelines
zink/ci: remove primitive-id-no-gs-quads from the NAVI10 fail list
radv/ci: add one more flake
radv: only copy non-NULL shaders when loaded from the cache
radv: rely on non-NULL binaries when inserting shaders to the cache
radv: allow to create/insert PS epilogs from/to the cache for libs
radv: remove dead code in radv_pipeline_get_nir()
radv: add VkGraphicsPipelineLibraryFlag to the graphics pipeline key
radv: ensure to retain NIR shaders for GPL libs found in the cache
radv: enable shaders cache for libraries with GPL
radv: fix VS prologs with GPL and static binding stride
radv: emit the PS epilog after the graphics pipeline
radv: add a helper for retaining NIR shaders
radv: move the serialized NIR to radv_graphics_lib_pipeline
radv: simplify a check when retaining NIR shaders
radv: do not retain noop FS for libs when a cache hit happened
radv: import retained NIR shaders later in the compilation process
radv/rt: stop storing unused hashes/identifiers
radv: create a helper for copying VkPipelineShaderStageCreateInfo
radv: copy stages instead of serializing NIR for GPL with the RETAIN flag
radv: enable VK_EXT_graphics_pipeline_library by default
radv/ci: update expected failures for PITCAIRN
radv/ci: remove no longer existing tests for PITCAIRN
radv/ci: update expected failures with BONAIRE
docs: add more release notes for RADV
radv: fix re-emitting vertex user SGPRs when binding a graphics pipeline
radv/ci: remove one RT test from the expected failures on RDNA3
radv: split radv_pipeline.c into radv_pipeline_{compute,graphics}.c
radv: fix pipeline creation feedback with imported graphics libs
radv: cleanup after splitting radv_pipeline.c
radv: fix detecting FMASK_DECOMPRESS/DCC_DECOMPRESS meta pipelines
vulkan: ignore rasterizationSamples when the state is dynamic
radv: try to keep HTILE compressed for READ_ONLY_OPTIMAL layout
radv: re-emit the guardband state when related PSO are bound
radv: disable fast-clears with CMASK for 128-bit formats
radv: do not allow 1D block-compressed images with (extended) storage on GFX6
radv: fix usage flag for 3D compressed 128 bpp images on GFX9
radv: update binning settings to work around GPU hangs
radv/amdgpu: fix adding continue preambles and postambles BOs to the list
radv: wait for occlusion queries in the resolve query shader
radv: delay enabling/disabling occlusion queries at draw time
radv: track DB_COUNT_CONTROL changes to avoid context rolls
radv: add the perf counters BO to the preambles BO list
radv: only enable extendedDynamicState3ConservativeRasterizationMode on GFX9+
ac/nir: fix 8-bit/10-bit PS exports clamping
radv: fix dynamic depth clamp enable support
radv: fix fast-clearing images with VK_REMAINING_{ARRAY_LAYERS,MIP_LEVELS}
radv: disable RB+ blend optimizations on GFX11 when a2c is enabled
Sarah Walker (1):
pvr: Update FWIF transfer queue register structures
Sathishkumar S (8):
radeonsi/vcn: add register definitions for JPEG 4.0.3
radeonsi/vcn: use register versions for jpeg
radeonsi/vcn: add support for picture crop on JPEG 4.0.3
radeonsi/vcn: support ARGB/RGBA conversion on JPEG 4.0.3
radeonsi/vcn: set jpeg reg version for gfx940
radeonsi/vcn: reset to default value when ROI/FC is not used
frontends/va: support crop region in jpeg decode
radeonsi/vcn: enable RGBA/ARGB formats on gfx940 jpeg
Sebastian Wick (1):
loader: do not check the mesa DRI_Mesa version if it was not found
Sergi Blanch Torne (8):
ci: disable Collabora’s LAVA lab for maintenance
Revert “ci: Collabora’s LAVA lab for maintenance”
ci: Uprev kernel to 6.1.7
ci: disable Collabora’s LAVA lab for maintenance
ci: disable Collabora’s LAVA lab for maintenance
ci: include setup test environment script in the output artifacts
Revert “ci: disable Collabora’s LAVA lab for maintenance”
ci: disable Collabora’s LAVA lab for maintenance
Sidney Just (4):
zink: Fix non debug builds failing to compile on
loader: Add missing brace to fix compile
zink: add check for samplerMirrorClampToEdge Vulkan 1.2 feature
zink: Add missing features to the profile file
Sil Vilerino (15):
d3d12: Honor suggested driver profile/level for H264/HEVC encode
d3d12: Video processing - Fix out of bounds array access
d3d12: Video Encode - Fix ID3D12CommandAllocator leak
d3d12: Fix VP9 Decode - Checking 0xFF instead of 0x7F for invalid frame_ref[i].Index7Bits
frontend/va: Add format support checks for VA_RT_FORMAT_* in VaCreateConfig/VaGetConfigAttributes
frontend/va: Remove duplicate code in format support checking/reporting.
frontend/va: Keep track of some VP9 previous frame data for current frame use_prev_in_find_mvs_refs
d3d12: VP9 Decode - Fix use_prev_in_find_mvs_refs calculation
d3d12: Fix video decode for interlaced streams with reference only textures required
d3d12: H264/HEVC Encode - Set both VBV InitialCapacity/Size in CBR Rate Control to same value when requested
d3d12: Encode H264/HEVC - Do not write PPS unless different from active
d3d12: Encode - Only upload headers when written headers size is > 0
nir: Fix use of alloca() without #include c99_alloca.h
Revert “d3d12: Honor suggested driver profile/level for H264/HEVC encode”
d3d12: Video processor to only promote resources to permanent residency when there is work to be flushed
Simon Fels (2):
venus: allow vtest socket being specified by env variable
virgl/vtest: allow socket being specified by env variable
Simon Perretta (38):
pvr: Add new Rogue compiler framework
pvr: Add support for optional instruction params
pvr: Support dual-destination ALU instructions
pvr: Commonise some instruction member defs
pvr: Drop the ENUM_PACKED macro
pvr: Keep NIR SSA defs instead of registers
pvr: Adjust instruction repeat offset
pvr: Validate instruction repeat and src/dst sizes
pvr: Add block printing support during validation
pvr: Clarify unreachable text
pvr: Add ADD64 support
pvr: Add memory load support
pvr: Add bitwise instruction support
pvr: Additional register subarray support
pvr: Support loading immediate values
pvr: Load descriptors from memory
pvr: Split pvr_private.h
pvr: Use descriptor/set/table offsets from driver
pvr: Add NIR pass to lower vars to SSA
pvr: Amend subarray ownership code
pvr: Add support for fitr.pixel
pvr: Add support for sample instructions
pvr: Add support for validating modifier combos
pvr: Add support for emitpix
pvr: Add support for WOP
pvr: Register allocation improvements
pvr: Fix descriptor set address calculation
pvr: Add support for generating per-job EOT program
pvr: Add support for generating NOP program
pvr: Add support for IDF
pvr: Add support for ST
pvr: Add branch support
pvr: Add support for TST
pvr: Add basic support for manual instruction grouping
pvr: Add support for MOVC
pvr: Add late op lowering pass and conditional execution
pvr: Amend definitions for ST and IDF
pvr: Add encodings for index registers
Simon Ser (1):
egl: fix fd_display_gpu on surfaceless and device platforms
Sonny Jiang (5):
radeonsi: Add NV12 support for AV1
gallium/pipe: change PIPE_DEFAULT_DECODER_FEEDBACK_TIMEOUT_NS to 1 second
amd/common: Add gfx940 codec query support
radeonsi/vcn: Add video capabilities support for gfx940
radeonsi/vcn: Add decode support for gfx940
SoroushIMG (31):
zink: add pass checking for lod overflow in txf
zink: add zink_cs_key
zink: add VK_EXT_image_robustness
zink: add robust_access field to shader key
zink: lower LOD-invalid txf when imageRobustAccess2 is missing
zink: update gl43 profile to allow imageRobustAccess
zink: fix sparse residency query and minLOD feature checks
zink: fix cap check for arb sparse texture2
zink: only save frag const buffers when used by blit
zink: fix leak when rebinding same image surface
zink: clear null image surfaces to 0
zink: fix pointcoord y inversion
zink: relax bresenhamLines requirement for non-strictLine drivers
zink: fix compute shader leaks
zink: allocate program shader caches from the program’s mem ctx
zink: stop creating pipeline library cache for non-optimal_key drivers
zink: free resource objects’ views array during destruction
zink: fix stale point sprite mode state
zink: fix shadow mask change logic when binding sampler views
zink: track shadow swizzle for all shader stages
zink: minor formatting change
zink: add needs_zs_shader_swizzle shader key
zink: extend shadow swizzle pass to all zs textures
zink: add depth/stencil needs shader swizzle workaround field
zink: workaround undefined swizzle 1 for z/s textures
zink: rename shadow key to zs swizzle
zink: Add driver name and API version to renderer name
zink: do not emit line stipple dynamic state when emulating
zink: take location_frac into account in lower_line_smooth_gs
zink: fix incorrect line mode check for bresenham
zink: refcount the correct query pool
Sui Jingfeng (1):
meson: add basic support for loongarch
SureshGuttula (1):
radeonsi: Add support for DPB resize
Sviatoslav Peleshko (9):
anv: Handle VkAccelerationStructureBuildRangeInfoKHR::transformOffset
driconf/anv: Apply limit_trig_input_range WA to Rise of the Tomb Raider
iris: Avoid creating uncompressed view with unaligned tile offsets on BDW
anv: Handle all fields in VkAccelerationStructureBuildRangeInfoKHR
anv: Move WA MEDIA_VFE_STATE after stalling PIPE_CONTROL
glsl: Fix codegen for constant ir_binop_{l,r}shift with mixed types
isl: Check all channels in isl_formats_have_same_bits_per_channel
anv: Handle UNDEFINED format in image format list
anv: Improve image/view usage bits verification
Tapani Pälli (42):
intel/compiler: add cpp_std=c++17 when building tests
intel/hasvk: remove some stale comments, wa was removed
anv: add restrictions for 3DSTATE_RASTER::AntiAliasingEnable
hasvk: add restrictions for 3DSTATE_RASTER::AntiAliasingEnable
iris: add restrictions for 3DSTATE_RASTER::AntiAliasingEnable
mesa: move component bits queries as GL ES only
intel/genxml: set unused 3DSTATE_PS_EXTRA field as mbz
intel: enable existing workaround for ICL platform
intel/blorp: disable REP16 for gfx12+ with R10G10B10_FLOAT_A2
iris: disable preemption for 3DPRIMITIVE during streamout
iris: handle error in iris_resource_from_handle
spirv: add workaround for Metro Exodus in spirv_to_nir
radv: revert Metro Exodus workaround which was moved to common code
mesa/st: refactor st_destroy_texcompress_compute condition
mesa/st: add astc decoder lookup tables
mesa/st: initialize resources for ASTC decoding
mesa: add astc decoder shader template (glsl es version)
mesa/st: support compute shader decoding of ASTC
anv: Wa_14016407139, add required pc when SBA programmed
iris: implement emission of 3DSTATE_HS for Wa_1306463417
anv: emit 3DSTATE_HS in cmd_buffer_flush_gfx_state
anv: limit generated draws to pipelines without HS stage
anv: implement emission of 3DSTATE_HS for Wa_1306463417
iris: emit 3DSTATE_HS for each primitive on gfx12
anv: emit 3DSTATE_HS for each primitive on gfx12
intel/compiler: add comment about workaround on simd width
anv: fix sends_count_expectation assert on simd32
intel/isl: disable TILE64 for YCRCB formats
anv: implement occlusion query related Wa_14017076903
iris: implement occlusion query related Wa_14017076903
intel/fs: restore message layout changes for cube array
anv: use primitive ID override when shader does not supply it
anv: take primitive ID override to account Wa_14015297576
anv: check for MESA_SHADER_TESS_CTRL with get_tcs_prog_data
intel/common: limit the amount of SLM with Wa_14017341140
intel/fs: use intel_needs_workaround for Wa_22013689345
intel/compiler: use intel_needs_workaround for Wa_14012437816
isl: disable mcs (and mcs+ccs) for color msaa on gfxver 125
iris: implement state cache invalidate for Wa_16013063087
anv: cleanup bitmask construction for PIPELINE_SELECT
anv: implement state cache invalidate for Wa_16013063087
isl: fix layout for comparing surf and view properties
Tatsuyuki Ishi (22):
radv: Fix depth-only-with-discard when epilogs are used.
radv: Fix emitting tess indirect descriptors twice.
radv: Loop over shader stages in flush_indirect_descriptor_sets.
radv: Fix noop FS not getting constructed for GPL pipelines.
radv: Fix missing rbplus_allowed check for dynamic PS epilogs.
radv: Assert the hardware supports rbplus when emitting rbplus state.
radv: Keep shader code ptr in a separately allocated buffer.
radv/sqtt: Use code buffer from radv_shader directly instead of copying.
radv: Replace radv_trap_handler_shader with radv_shader.
radeonsi: SDMA v4 size field is size - 1
radv: SDMA v4 size field is size - 1
radv: Remove SDMA padding from copy helpers.
radv: Use common helpers to translate format in SDMA copy.
radv/rt: Don’t upload the prolog twice.
radv: Use radeon_cmdbuf for sdma_copy_image.
radv: Introduce sdma_copy_buffer for GFX7+.
radv: Upload shaders to invisible VRAM on small BAR systems.
radv: Wait for shader uploads asynchronously.
radv: Fix missing wait of GS copy shader upload for dmashaders.
amd: Add radv_foreach_stage to ForEachMacros.
radv: Pre-compute descriptor set layout hash.
ci/android: Make armv8’s arch aarch64 instead of arm.
Teng, Jin Chung (2):
frontend/va: Add large_scale_tile from VADecPictureParameterBufferAV1
d3d12: AV1 Dec - Set anchor_frame_idx only when large_scale_tile equals 1
Thomas H.P. Andersen (4):
docs/panvk: VK_KHR_descriptor_update_template
meson: use summary()
meson: use sections in summary()
v3dv: use common code for descriptor update template
Thong Thai (6):
gallium/auxiliary/vl: clean-up progressive shader
radeonsi/vcn: use encoder/decoder caps reported by kernel
gallium/auxiliary/vl: add crop to compute shader
mesa/main: rework locale setup/teardown
util: check and initialize locale before using it
tgsi: use locale independent float and double parsing
Timothy Arceri (15):
nir/nir_opt_copy_prop_vars: remove extra loop
nir/nir_opt_copy_prop_vars: avoid comparison explosion
nir/nir_opt_copy_prop_vars: reuse hash tables
nir/nir_opt_copy_prop_vars: reuse dynamic arrays
nir/nir_opt_copy_prop_vars: reorder clone calls
nir/nir_opt_copy_prop_vars: don’t call memset when cloning
ci: enable dEQP-VK.ubo.random.all_shared_buffer.48
glsl: copy prop vars before scalarizing alus
glsl: add _token_list_prepend() helper to the parser
glsl: isolate object macro replacements
glsl: remove do_copy_propagation_elements() optimisation pass
glsl: allow 64-bit integer on RHS of shift
util/00-mesa-defaults: add Akka Arrh workaround
mesa: add _mesa_is_api_gles2() helper
glsl: move some GL ES checks to the NIR linker
Timur Kristóf (155):
aco/optimizer: Add missing v_lshlrev condition to can_apply_extract.
aco/optimizer: Optimize p_extract + v_mul_u32_u24 to v_mad_u32_u16.
radv: Make NGG query emission a dirty flag.
radv: Get rid of app_shaders_internal.
radv, aco: Add uses_full_subgroups to compute shader info.
aco: Enable constant exec mask based optimization on compute shaders.
radv: Lower dynamic VS inputs in NIR.
aco: Remove dynamic VS input loads.
nir: Add pack_half_2x16_rtz_split opcode.
radv, aco, ac: Implement pack_half_2x16_rtz_split.
nir: Lower pack_half_2x16_split to RTZ if available.
nir: Add algebraic optimization for VKD3D-Proton fp32->fp16 conversion.
ac/gpu_info: Add has_pcie_bandwidth_info.
radv: Don’t place CS in VRAM when bandwidth is too low.
nir/opt_algebraic: Add optimization for ieq/ine and right-shift.
radv: Disable NGG culling when conservative overestimation is used.
ac/nir/cull: Always remove zero-area triangles in face culling.
ac/nir/ngg: Include culled primitives in query.
radv: Don’t change LDS_SIZE for NGG culling shaders.
radv: Move checking primitive topology to radv_get_ngg_culling_settings.
radv: Use shader code to skip NGG culling in small workgroups.
radv: Remove NGG culling skip from command buffer.
radv: Refactor radv_emit_ngg_culling_state so it’s based on dirty flags.
nir: Clarify comment above load_buffer_amd.
ac: Port ACO’s get_fetch_format to ac_get_safe_fetch_size.
ac/llvm: Remove “structurized” argument and instead check vindex.
ac/llvm: Fix buffer_load_amd with larger than 32-bit channel sizes.
ac/llvm: Fix ac_build_buffer_load to work with more than 4 channels.
ac/llvm: Change ac_build_tbuffer_load to take format and channel type.
radv: Move VS input lowering to new file: radv_nir_lower_vs_inputs.
aco: Get rid of redundant load_vmem_mubuf function.
aco: Don’t set scalar offset on buffer load instructions when it’s zero.
aco: Remove MTBUF zero operand.
radv: Call nir_lower_array_deref_of_vec in radv_lower_io_to_scalar_early.
aco/optimizer: Change v_cmp with subgroup invocation to constant.
radv: Emulate VGT_ESGS_ITEMSIZE in shaders on GFX9+.
util: Add util_format_get_array.
ac: Add pending_vmem field to args.
radv: Set pending_vmem on dynamic VS input args.
aco: Generalize vs_inputs to args_pending_vmem.
aco, radv: Rename aco_*_key to aco_*_info.
aco, radv: Move PS epilog and VS prolog args to their info structs.
aco, radv: Don’t use radv_shader_args in aco.
aco: Don’t include headers from radv.
ac/nir: clear nir_var_shader_out from TCS barriers
aco: Remove vtx_binding from MUBUF/MTBUF instructions.
nir: Add load_typed_buffer_amd intrinsic.
aco: Implement load_typed_buffer_amd.
ac/llvm: Implement typed buffer load intrinsic.
radv: Lower non-dynamic VS inputs in NIR.
radv: Apply swizzle and alpha adjust in radv_nir_lower_vs_inputs.
aco: Remove VS inputs from visit_load_input.
aco: Rename visit_load_input to visit_load_fs_input.
radv: Remove VS inputs code from LLVM backend.
ac/llvm: Remove unused function ac_build_struct_tbuffer_load.
aco, radv: Remove VS IO information from ACO.
aco: Don’t add soffset to swizzled MUBUF base.
aco: Use zero for MUBUF/MTBUF when soffset is undefined.
aco: Disable MUBUF/MTBUF offsets when they are zero.
aco: Always enable idxen for swizzled buffer access on GFX11.
ac/nir/ngg: Remove unused lds_es enum values.
ac/nir/ngg: Rename saved_uniform to reusable_nondeferred_variable.
ac/nir/ngg: Split some functions out of save_reusable_variables.
ac/nir/ngg: Move divergence analysis call to analyze_shader_before_culling.
ac/nir/ngg: Rename state variables to “s”.
ac/nir/ngg: Remove some superfluous variables.
ac/nir/ngg: Create separate variable for repacked rel_patch_id.
ac/nir/ngg: Rename repacked variables to clarify their name.
ac: Add more defines for mesh shading packets.
radv: Use new mesh shading packet defines.
radv: Add per-prim attributes to ring_attr stride.
radv: Use per-prim params in has_param_exports.
radv: Add extra offset to per-prim params.
radv: Use PRIM_ATTR for PS inputs on GFX11.
radv: Include per-prim params in NUM_INTERP on GFX11.
radv: Adjust mesh draw packets for GFX11.
ac/nir/ngg: Clarify mesh shader scratch ring.
ac/nir/ngg: Use attribute ring for mesh shader params.
ac/nir/ngg: Split legacy workgroup index function.
ac/nir/ngg: Fix mesh shader layer on GFX11.
ac/nir/ngg: Store special MS outputs in attribute ring for PS to read.
radv: Enable mesh shading on GFX11.
radv: Fix swizzled VS input loads when some components are unused.
radv: Don’t expose NV_mesh_shader and don’t use it in CI.
radv: Remove NV_mesh_shader API entrypoints.
radv: Remove first_task and ib_addr/ib_stride.
radv: Clean up emitting zero mesh shader draw id.
ac/nir/ngg: Remove NV_mesh_shader support.
ac/nir: Remove ac_nir_apply_first_task_to_task_shader.
nir: Remove IB address and stride intrinsics.
radv: Move radv_nir_* to a new folder.
radv: Move radv_nir_lower_primitive_shading_rate to new file.
radv: Move radv_nir_lower_fs_intrinsics to new file.
radv: Move radv_nir_lower_intrinsics_early to new file.
radv: Move radv_nir_lower_view_index to new file.
radv: Move radv_nir_lower_viewport_to_zero to new file.
radv: Move radv_nir_export_multiview to new file.
radv, ac/nir: Move sin/cos lowering to a common pass.
radv: Move I/O lowering functions into a new file.
radv: Use radv_get_shader to get vertex shader when binding pipeline.
ac/nir/ngg: Slightly improve attribute ring offset calculation.
ac/nir: Store only lowest 8 bits for task draw ring DWORD3.
ac/nir: When task->mesh dispatch Y or Z are 0, also set X to 0.
aco: Consider p_cbranch_nz as divergent branch too.
aco: Don’t remove exec writes that also write other registers.
aco: Simplify get_phi_operand using Operand::c32_or_c64.
aco: Don’t verify branch exec read when eliminating exec writes.
aco: Pop branch operands when targets are same in SSA elimination.
aco: Call dominator_tree before lower_phis.
aco: Better phi lowering for merge block when else-side is const.
nir: Gather compile time constant task->mesh dispatch size.
radv: Use linear_dispatch info in GFX11 task/mesh draw packet.
radv/amdgpu: Extract CS chain and unchain functions.
radv/amdgpu: Expose CS chain and unchain on the winsys.
radv/amdgpu: Extract radv_amdgpu_add_cs_to_bo_list function.
radv/amdgpu: Remember which CS the current one is chained to.
radv/amdgpu: Walk chained CS objects for BO list.
radv/amdgpu: Unchain CS array in queue code not in winsys.
radv: Chain cmd buffers in queue code when possible, not in winsys.
radv/amdgpu: Remove can_patch and chained submit code path.
ac/llvm: Cover runtime 0 in GFX10 gs_alloc_req workaround.
aco: Fix optimization of v_cmp with subgroup invocation.
aco: Don’t use nir_selection_control in aco_ir.
aco: Only include nir.h in instruction selection.
radv: Don’t include nir.h in radv_shader.h
radv: Create continue preamble on GFX6 even when no shader rings are used.
ac: Add maximum number of submitted IBs.
radv/amdgpu: Fix mismatching return type of radv_amdgpu_cs_submit.
radv/amdgpu: Only allow IB BOs on graphics and compute queues.
radv/amdgpu: Use correct alignment when creating CS BOs.
radv/amdgpu: Extract radv_amdgpu_cs_add_old_ib_buffer.
radv/amdgpu: Add a few assertions during submit.
radv/amdgpu: Remove hw_can_chain in favour of use_ib.
radv/amdgpu: Rewrite fallback code path so it can split submissions.
radv/amdgpu: Allow multiple continue preambles.
radv/amdgpu: Add continue preambles to fallback submit.
radv/amdgpu: Add postambles to fallback submit.
radv/amdgpu: Add ability to submit non-chained CS to fallback.
radv/amdgpu: Split gang submissions correctly when not chained.
radv: Fill continue preambles and postambles properly.
radv: Split submission in winsys instead of radv_queue.
radv/amdgpu: Use fallback submit for queues that can’t use IBs.
radv/amdgpu: Clean up submission functions.
radv/amdgpu: Respect maximum number of submitted IBs per IP type.
radv: Allow task/mesh shaders with RADV_DEBUG=noibs.
radv/amdgpu: Add bool is_secondary argument to cs_create function.
radv/amdgpu: Extract radv_amdgpu_cs_bo_create function.
radv/amdgpu: Place secondary CS without IB2 in non-WC GTT.
ac, aco, radv: Clarify LDS size on GFX6, and NGG shaders.
radv: Don’t hardcode LDS granularity in gfx9_get_gs_info.
aco: Remove setup_*_variables and add setup_lds_size instead.
aco, radv: Remove “key” from aco_compiler_options.
aco, radv: Remove redundant enable_mrt_output_nan_fixup from PS epilog info.
ac/nir/ngg: Don’t store primitive IDs from culled primitives.
aco: Disallow constant propagation on SOPP and fixed operands.
Tomeu Vizoso (7):
android: Make libbacktrace optional again
android: Cleanup unneeded headers from the sync stub
ci: Build for Android with libbacktrace=false
ci: Use NDK 25b to build for the Android ABI level 33
etnaviv: handle missing alu conversion opcodes
etnaviv: print writemask of store operations
etnaviv: don’t read too much from uniform arrays
Turo Lamminen (4):
radv: Change radeon_cmdbuf counters to uint64_t to make alias analysis optimize radeon_emit better
radv: Clean up variables in si_get_ia_multi_vgt_param
radv: Avoid redundant fetch of radv_device
radv: Optimize emitting prefetches
Val Packett (1):
mailmap: Remap name and email for Val Packett
Vincent Davis Jr (1):
gbm/backend: fix gbm compile without dri
Vinson Lee (2):
radv: Fix memory leak.
pps: Fix build errors.
Vitaliy Triang3l Kuzmin (2):
radv: Set DB_Z_INFO.NUM_SAMPLES to MSAA_EXPOSED_SAMPLES without Z/S
r600: Alpha to coverage dithering on Evergreen+
Väinö Mäkelä (12):
intel/vec4: Set the rounding mode
intel/vec4: Don’t optimize multiply by 1.0 away
hasvk: Don’t claim shaderDenormPreserveFloat32 on gfx7
hasvk: Tell spirv_to_nir float controls are always supported
hasvk: Enable PixelShaderKillsPixel when omask is used
hasvk: Mark VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL as stencil write optimal
hasvk: Handle subpass self-dependencies for stencil shadow copies
ci/intel: Update hasvk HSW xfails
hasvk: Disable non-zero fast clears for 8xMSAA images
intel/ci: Remove outdated hasvk copy_and_blit xfails
intel/ci: Remove hasvk xfails missing from the CTS
intel/ci: Remove skipped float_controls tests from hasvk xfails
X512 (3):
hgl: remove
haiku: fix build
EGL: implement Haiku driver over Gallium
Yevhenii Kolesnikov (2):
nir/loop_analyze: Track induction variables incremented by more operations
nir/loop_analyze: Determine iteration counts for more kinds of loops
Yiwei Zhang (46):
venus: log upon device creation
venus: lazily query and cache gralloc front rendering usage
venus: disable non AHB external memory bits on Android
venus: fix formatting
venus: fix tracing init to include instance creation
venus: render server enforces blob_id_0
venus: move exp features init back to use ring submit
venus: further disallow sparse resource
venus: replace binary search with hardcode for max buffer size
venus: start requiring all experimental features
venus: clean up memoryResourceAllocationSize
venus: clean up globalFencing
venus: refactor sync fd fence and semaphore features
venus: tighten up the sync fd requirements for Android wsi
venus: distinguish external memory from mappable memory support
venus: fix external buffer creation
venus: remove redundant abstractions for wsi struct search
venus: refactor image create info pnext tracking
venus: simplify ahb image creation
venus: simplify support for non-AHB external images
venus: fix external image creation
venus: fix device memory export alloc info
venus: fix VK_EXT_image_view_min_lod feature query
venus: ensure invariance of buffer memory requirement size
venus: sync to latest protocol for ring status enum
venus: abort ring submit when ring is in fatal status
venus: propagate vn_ring to vn_relax
venus: vn_relax to abort on ring fatal status upon warn order
venus: revert back the warn order
venus: sync to latest protocol for asyncRoundtrip
venus: switch to use 64bit roundtrip seqno
venus: make vn_instance_wait_roundtrip asynchronous
venus: let vn_instance_submit_command track ring seqno
venus: make common wsi bo submission async
venus: refactor to add vn_sync_payload_external
venus: make external fence and semaphore export async
Revert “zink/kopper: Add extra swapchain images for Venus”
venus: sync latest protocol for layering extensions
venus: add VK_EXT_load_store_op_none support
venus: add VK_EXT_rasterization_order_attachment_access support
venus/docs: sync to latest venus supported extensions
venus: requires asyncRoundtrip
venus: requires ringMonitoring
venus: move exp feature init back to ring and remove unused function
venus: forward ARM driverVersion for ANGLE workarounds
radv: respect VK_QUERY_RESULT_WAIT_BIT in GetQueryPoolResults
Yogesh Mohan Marimuthu (18):
egl: add render_gpu tag to dri2_dpy->fd and dri2_dpy->dri_screen variable
loader,glx: add render_gpu tag psc->driScreen and psc->fd
loader,glx,egl,vl,d3d: loader_get_user_preferred_fd() function to return original_fd
egl: remove is_different_gpu variable from struct dri2_egl_display
glx: remove is_different_gpu variable from struct dri_screen
loader,glx,egl: remove is_different_gpu variable from loader
ac,radeonsi: move shadow regs create ib preamble function to amd common
radv: add shadowregs variable to RADV_DEBUG environment variable
radv: add support for register shadowing
radv: set preemp flag and pre_ena bit for shadowregs
radv: INDEX_TYPE and NUM_INSTANCES PKT3 are not shadowed
radv: fence complete struct is 4 qw size
radv: allow NULL initial_preamble_cs in radv_amdgpu_winsys_cs_submit_sysmem()
radeonsi: remove some shadow reg optimization for bf1 game
wsi/display: check alloc failure in wsi_display_alloc_connector()
ac/surface: only adjust pitch if surf_pitch was modified
amd/surface: add RADEON_SURF_NO_TEXTURE flag
radv: set RADEON_SURF_NO_TEXTURE flag in radv_get_surface_flags()
Yogesh Mohanmarimuthu (7):
egl: add fd_display_gpu to struct dri2_egl_display
egl,egl/x11: keep display fd open for prime
egl: create DRI screen for display GPU in case of prime
loader,glx,egl/x11: init dri_screen_display_gpu in struct loader_dri3_drawable
egl/wayland: keep display fd open for prime
loader: make image_format_to_fourcc() non-static
egl/wayland: for prime, allocate linear_copy from display GPU VRAM
Yonggang Luo (8):
util: Implement util_iround with lrintf unconditionally
util: Fixes error: no previous prototype for ‘mesa_cache_db_entry_remove’ Fixes: c92c99481fd (“util/mesa-db: Support removal of cache entries”)
vulkan: Use static_assert to check HWVULKAN_DISPATCH_MAGIC == ICD_LOADER_MAGIC
meson: Split c_cpp_args from pre_args
meson: Combine duplicated c_args and cpp_args
meson: When sse2 enabled, both c and cpp using sse2 options
meson: Split sse2_arg and sse2_args out of c_cpp_args
meson: Use sse2_arg and sse2_args to replace usage of c and c_sse2_args
Yusuf Khan (2):
nvc0/nv50: support and enable EXT_memory_object*
gallium: create query_memory_info implementation for sw drivers
Yuxuan Shui (1):
loader: unregister special event in loader_dri3_drawable_fini
antonino (54):
zink: fix line smooth lowering
zink: add `zink_emulate_point_smooth` driconf
zink: add `lower_point_smooth` to `zink_fs_key`
zink/nir_to_spirv: add support for `nir_intrinsic_load_point_coord`
nir: handle output being written to deref in `nir_lower_point_smooth`
zink: handle point_smooth emulation
drirc: set `zink_emulate_point_smooth` for Quake II
zink: fix stipple pattern in oblique lines
zink: fix `final_hash` update in `zink_gfx_program_update`
mesa: correctly allocate space for converted primitives
gallium: decompose quad strips into quads if supported
zink: handle switching between primitives
nir: handle primitives with adjacency
nir: avoid generating conflicting output variables
nir: calculate number of vertices in nir_create_passthrough_gs
nir: handle edge flags in nir_create_passthrough_gs
zink: add `has_edgeflags` flag to zink_shader and zink_gfx_program
zink: handle edgeflags
nir: allow to force line strip out in nir_create_passthrough_gs
zink: force line strip out when emulating stipple
zink: filled quad emulation gs generation function
zink: add `zink_rast_prim` enum
zink: handle quads
zink: fix flat shading on filled quads
zink: add flags to `zink_gfx_program` and `zink_context`
zink: add `needs_inlining` to `zink_shader`
zink: implement flat shading using inlined uniforms
nir/zink: handle provoking vertex mode in `nir_create_passthrough_gs`
zink: handle provoking vertex mode for filled quads
nir: keep xfb properties in nir_create_passthrough_gs
zink: keep xfb properties in quad emulation gs
zink: advertise support for the quad primitive
zink: prevent crash when freeing
zink: unified `zink_set_primitive_emulation_keys` and `zink_create_primitive_emulation_gs`
zink: add `parent` to `zink_shader::non_fs`
zink: improve generated gs unbinding
zink: unbind generated gs in `bind_last_vertex_stage`
zink/ci: remove `primitive-id-no-gs-quads` from radv-vangogh-fails
nir: only handle flat interpolation when needed in `nir_create_passthrough_gs`
zink: simplify logic to call `zink_set_primitive_emulation_keys`
zink: add field to ‘zink_gs_key’ and enum
zink: add provoking vertex mode lowering
zink: always advertize provoking vertex mode support
zink: update requirements now that pv mode can be emulated
zink: add `descriptor_bindless_id` to `zink_shader_info`
zink: fix sampler array collision in `nir_to_spirv`
zink: don’t emulate edgeflags for patches
zink: use correct primitives for passthrough gs with tess
zink: fix pv mode lowering index calculation
zink: use ring buffer to preserve last element
zink: fix exit condition on pv emulation loop
zink: fix line strip offsets in pv mode emulation
zink: fix store substitution in `lower_pv_mode_gs_store`
zink: take location_frac into account in pv emulation
driver1998 (1):
gallium: Use DETECT_OS_WINDOWS instead of ‘WIN32’
osy (1):
virgl: enable timer queries only if host supports it
t0b3 (1):
nir/nir_opt_move: fix ALWAYS_INLINE compiler error
volodymyr.o (1):
mesa ctx->API --> _mesa_is_foo(ctx)
xurui (2):
panfrost: Check the return value of drmGetVersion
zink: bs->dd.push_pool[1].pool should be freed