Mesa 24.0.0 Release Notes / 2024-02-01
Mesa 24.0.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 24.0.1.
Mesa 24.0.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
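Because the reported version varies by driver and context type, applications usually check the string they get back rather than assume 4.6. The following is a minimal sketch of that check in Python. It assumes the string has already been obtained from glGetString(GL_VERSION) through some binding; the helper names and the sample strings in the usage note are hypothetical, not taken from Mesa.

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a GL_VERSION string.

    Desktop GL strings begin with "major.minor"; OpenGL ES strings carry an
    "OpenGL ES " (or "OpenGL ES-CM ") prefix before the number.
    """
    match = re.match(r"(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_requirement(version_string, required=(4, 6)):
    """True if the reported version is at least `required` (major, minor)."""
    return parse_gl_version(version_string) >= required
```

For example, an illustrative string such as "4.6 (Core Profile) Mesa 24.0.0" would pass the default check, while "3.3 (Compatibility Profile) Mesa 24.0.0" would not.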
Mesa 24.0.0 implements the Vulkan 1.3 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
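The apiVersion property is a packed 32-bit integer, so it has to be decoded before it can be compared against 1.3. A small sketch of the bit layout used by the Vulkan version macros (variant in the top 3 bits, then 7 bits of major, 10 bits of minor, 12 bits of patch) is shown below. The function names are hypothetical helpers, and the value is assumed to have been read from VkPhysicalDeviceProperties elsewhere.

```python
def decode_vk_api_version(api_version):
    """Split a packed Vulkan apiVersion into (variant, major, minor, patch)."""
    variant = (api_version >> 29) & 0x7
    major = (api_version >> 22) & 0x7F
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return variant, major, minor, patch


def make_vk_api_version(variant, major, minor, patch):
    """Pack the four fields back into a 32-bit apiVersion value."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def supports_vulkan(api_version, required=(1, 3)):
    """True if the decoded (major, minor) is at least `required`."""
    _, major, minor, _ = decode_vk_api_version(api_version)
    return (major, minor) >= required
```

A driver that reports a lower apiVersion than 1.3 would make supports_vulkan() return False even though Mesa as a whole implements Vulkan 1.3.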
SHA256 checksum
dc7e8c077bc5884df95478263b34bdebb7e88e600689cb56fb07be2b8c304c36 mesa-24.0.0.tar.xz
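One way to check a downloaded tarball against the checksum above is sketched here using Python's standard hashlib module. The function names are hypothetical, and the file path is assumed to point at a local copy of mesa-24.0.0.tar.xz.

```python
import hashlib

# Checksum published in these release notes for mesa-24.0.0.tar.xz.
EXPECTED_SHA256 = "dc7e8c077bc5884df95478263b34bdebb7e88e600689cb56fb07be2b8c304c36"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling verify_checksum("mesa-24.0.0.tar.xz") in the download directory should return True for an intact file; a False result suggests a corrupted or altered download.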
New features
VK_EXT_image_compression_control on RADV
VK_EXT_device_fault on RADV
OpenGL 3.3 on Asahi
Geometry shaders on Asahi
GL_ARB_texture_cube_map_array on Asahi
GL_ARB_clip_control on Asahi
GL_ARB_timer_query on Asahi
GL_EXT_disjoint_timer_query on Asahi
GL_ARB_base_instance on Asahi
OpenGL 4.6 (up from 4.2) on d3d12
VK_EXT_depth_clamp_zero_one on RADV
GL_ARB_shader_texture_image_samples on Asahi
GL_ARB_indirect_parameters on Asahi
GL_ARB_viewport_array on Asahi
GL_ARB_fragment_layer_viewport on Asahi
GL_ARB_cull_distance on Asahi
GL_ARB_transform_feedback_overflow_query on Asahi
VK_KHR_calibrated_timestamps on RADV
VK_KHR_vertex_attribute_divisor on RADV
VK_KHR_maintenance6 on RADV
VK_KHR_ray_tracing_position_fetch on RADV
EGL_EXT_query_reset_notification_strategy
Bug fixes
vlc crashes when playing 1920x1080 video with Radeon RX6600 hardware acceleration and deinterlacing enabled.
[radeonsi] Regression: graphical artifacting on water texture in OpenGOAL
Assertion when creating dmabuf-compatible VkImage on Tigerlake
VAAPI: EFC on VCN2 produces broken H264 video and crashes the HEVC encoder
[AMDGPU RDNA3] Antialiasing is broken in Blender
MTL: vulkan cooperative matrix tests gpu hang on MTL
Assassin’s Creed Odyssey wrong colors on Arc A770
The Finals fails to launch with DX12 on Intel Arc unless “force_vk_vendor” is set to -1.
VA-API CI tests freeze
radv: games render with garbage output on RX5600M through PRIME with DCC
radv: RGP reports for mesh shaders are confusing
zink crashes on nvidia
d3d10umd: Build failure regression with MSVC during 23.3 development cycle
Error during SPIR-V parsing of OpCopyLogical
rusticl: fails to find SPIRV-Tools headers via pkg-config under non-default prefix
Conservative depth output doesn’t work with RADV
RADV: DOA-X3 (yuzu) missing hair, eyes and skybox
intel: Require 64KB alignment when using CCS and multiple engines
radv: Atlas Fallen corrupted rendering
r300: nir pass to lower indirect regression
r300: LRP present even with .lower_flrp32=true
23.3.2 regression: kms_swrast_dri.so segfaults
Radeon: YUYV DMA BUF eglCreateImageKHR fails
No support for a644
anv: importing memory for a compressed image using modifier is hitting an assert
Large regression in `glbench --tests context` on Intel
Android 14 depends on Vulkan EXT_swapchain_maintenance1, which breaks radv
nvk,nak: Implement shaderFloat64
Mesa is not compatible with Python 3.12 due to use of distutils
anv: glcts regression on zink
nir: Trivial loop not unrolling
Possible regression with AMD GPU with flatpak apps
nvk,nak: Implement VK_KHR_vulkan_memory_model
Compiling Mesa with X in custom prefix fails in Intel Vulkan driver
anv: implement recommended AUX-TT invalidation on compute/transfer queues
!26307 broke some piglit tests with rusticl on radeonsi on Navi 14
Compute shader with imageStore() to a swapchain image (from a display surface) produces incorrect results (Raspberry, Vulkan).
nvk: Implement VK_EXT_multi_draw
radv/aco: Crysis 2 Remastered RT reflections are blocky around the edges with ACO, renders normally with LLVM
radv: Major regression in main branch causing all Vulkan apps to crash on 6600M (Navi 23)
[23.3.0] Parallel build failure - fatal error: vtn_generator_ids.h: No such file or directory
crocus: Assertion failures in NIR divergence analysis
nak: Implement nir_op_fmulz
nvk,nak: Implement VK_KHR_shader_float_controls
748b7f80ef1cf6a3fed9991d70230e69fef51a0e - Regression on Doom Eternal w/ RT Reflections
glFlush() blocks until close to GPU completion on Radeon R9 270
nvk: Implement VK_EXT_texel_buffer_alignment
rusticl: fails to find X11 headers via pkg-config under non-default prefix
nvk,nak: Implement VK_EXT_shader_image_atomic_int64
nvk,nak: Implement VK_KHR_shader_atomic_int64
nvk,nak: Implement VK_KHR_shader_subgroup_extended_types
nvk,nak: Implement shaderInt64
nvk: Implement VK_EXT_subgroup_size_control
mesa:freedreno / afuc-disasm unit test failure
anv: Resident Evil 2 hang
Mesa 23.3.0 release build fails on 22.04 LTS
Segfault in SDL2 game when using environment variables: `SDL_VIDEODRIVER=wayland DRI_PRIME=1`
Mesa 22.3.0 SEGFAULT in nir shader creation for r600 cards on FreeBSD
radeonsi: merge request 26055 causes thousands of piglit failures
iris: INTEL_COMPUTE_CLASS causes gpu hangs on MTL platforms
anv: piglit tests regressed for zink
aco,radeonsi: GFX11 dEQP-GLES31.functional.separate_shader.random.0 fail when AMD_DEBUG=useaco
crash in si_update_tess_io_layout_state during _mesa_ReadPixels (radeonsi_dri, mesa 23.2.1)
Compilation error with current LLVM git (createLoopSinkPass)
[RADV] War Thunder has some grass flickering.
radv: satisfactory broken shader
RADV problem with R7 M440 in some games
nvk,nak: Weird fog effect in old GTA games with DXVK
gpu driver crashes when opening ingame map playing dead space 2023
[anv] Valheim water misrendering
radv, zink: dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component16 fails on gfx9
Armored Core 6 (1888160) fake_sparse support
radv: fix sparseResidencyImage3D on GFX8
build still broken on Slackware 15.0 i586
mesa fails to build on arch
EGL/v3d: EGL applications under an X compositor don’t work
nvk,nak: Implement VK_KHR_fragment_shader_barycentric
RADV: trunc_coord breaks ambient occlusion in Dirt Rally and other games
radv: Mass Effect Legendary Edition: a line going across the screen is visible in some areas with Ambient Occlusion enabled
LTO-related build failures
anv: DIRT5 gfx11_generated_draws_spv_source triggers “assert(!copy_value_is_divergent(src) || copy_value_is_divergent(dest));”
nvk: Implement VK_KHR_synchronization2
nvk: Implement bufferDeviceAddressCaptureReplay
nvk,nak,codegen: Implement VK_KHR_pipeline_executable_properties
panfrost: gbm_bo_get_offset() wrongly returns 0 for second plane of NV12 buffers
Satisfactory since Update 8 needs force_vk_vendor set
[RADV][TONGA] - BeamNG.drive (284160) - Artifacts are present when looking at the skybox.
LEGO Star Wars: The Skywalker Saga graphical glitches (DXVK) on R9 380
[radv] Crypt not rendering properly
Leaks of DescriptorSet debug names
[Tracing flake] Missing geometry in trace@freedreno-a630@freedoom@freedoom-phase2-gl-high.trace
Unreal Engine 5.2 virtual shadow maps have glitchy/lazy tile updates
RADV: Visual glitches in Unreal Engine 5.2.1 when using material with anisotropy and light channel 2
radv: Regression with UE5 test
SIGSEGV with MESA_VK_TRACE=rgp and compute only queue
mesa: vertex attrib regression
[ANV] Corruptions in Battlefield 4
anv regression w/ commit e488773b29d97 (“anv: Fast clear depth/stencil surface in vkCmdClearAttachments”)
freedreno uses wrong patch size
ir3: dEQP-GLES31.functional.synchronization.inter_invocation.image_atomic_read_write crash on a6xx gen4
a630: antichamber crashes with pack_A6XX_GRAS_CL_GUARDBAND_CLIP_ADJ: Assertion
mesa:amd+compiler / aco_tests assembler.gfx11.vop12c_v128/gfx11 failure with llvm-17
ci_run_n_monitor crash because of incorrect parsing of dag
Zink + Venus: driver can’t handle INVALID<->LINEAR!
anv not initializing engine correctly with INTEL_COPY_CLASS=1
Anv: Particles have black square artifacts on Counter Strike 2 on Skylake
Lords of the Fallen 2023 Red Eye mode crashing game and desktop
[radeonsi] [vulkan] [23.3-rc1 regression] Video output corrupted in QMplay2 with Vulkan renderer
[BISECTED] ac/radeon commit somehow breaks nv12 surface from HEVC decode
radv: Chrome crashes when ANGLE uses GPL
Parsec displays completely green screen with hardware decoder selected while using Mesa 23.3 and Mesa 24
H264 to H264 transcode output corruption with gst-vaapi
opencl-jpeg-encoder does not work with nouveau/rusticl, works with nouveau/clover
[rusticl] [radeonsi] [darktable4] [ppc64le] Darktable always renders black images despite not throwing any error
[R600] X-plane 11 demo (Linux Native) crashes upon launch on HD5870 and HD6970
[CI] .gitlab-ci/setup-test-env.sh date -d parsing fails on Alpine Linux containers
ANV not handling VkMutableDescriptorTypeCreateInfoEXT::pMutableDescriptorTypeLists[i] being out of range
Ubuntu 23.10 build error with rusticl_opencl_bindings.rs
Rusticl fails to build
tu: Wolfenstein: The New Order misrenders on a740
DRI_PRIME fails with ACO only radeonsi
ci_run_n_monitor: undetected sanity dep breaks the pipeline
Changes
Alejandro Piñeiro (10):
broadcom/qpu: use back BITFIELD64_RANGE for ANYOPMASK
broadcom/compiler: add v3d_pack_unnormalized_coordinates helper
broadcom: only support v42 and v71
broadcom/compiler: set properly lod query
broadcom/cle: remove v33 and v41 from xml definition
broadcom/cle: rename xml files
docs/v3d: update v3d documentation
nir: add new opcodes to map new v71 packing/conversion instructions
broadcom/compiler: update image store lowering to use v71 new packing/conversion instructions
broadcom/compiler: remove one superfluous call to nir_opt_undef
Alessandro Astone (2):
asahi: Use the compat version of qsort_r
zink: Fix resizable BAR detection logic
Alexander von Gluck IV (3):
egl/haiku: Cleanup includes; minor build fix
hgl: Redefine visual options in hgl_context.h
egl/haiku: Remove some dead cleanup code
Alyssa Rosenzweig (286):
hasvk: Support building on non-Intel
crocus: Support building on non-Intel
meson: Add vulkan-drivers=all option
meson: Add gallium-drivers=all option
gitlab: Highlight .cl as C
nir,vtn: Add exported bool to nir_function
nir: Add nir_remove_non_exported
nir/builder: Add nir_call helper
meson: Simplify clc expression
meson: Require clc for asahi
vtn: Add spirv_library_to_nir_builder feature
clc: Add missing idep_vtn
agx: Fix lower regular texture metadata
agx: Vectorize load/stores
agx: Fuse (unmasked) extr_agx
agx: Fuse ubitfield_extract
asahi: Fix agx_pack unrolling
asahi: Make GenXML compatible with OpenCL
asahi: Unpack at 32-bit granularity
asahi: Reexpress genxml pack macro
asahi: Add folder for internal shaders
asahi: Add asahi_clc infrastructure
asahi: Pass valid memctx to open_device
asahi: Deserialize libagx when opening device
asahi,agx: Plumb libagx
asahi: Add software-defined field to texture desc
agx: Use CL for texture lowerings
asahi: Remove placeholder shader
asahi: Fix tools=all builds
ci: Opt out asahi from clang-format
ttn: Set sample shading for sample ID reads
compiler: Make shader_enums.h CL-safe
compiler: Inline mesa_vertices_per_prim
compiler: Make u_decomposed_prims_for_vertices available to CL
nir/lower_gs_intrinsics: Include primitive counts
nir/lower_gs_intrinsics: Append EndPrimitive
nir/lower_gs_intrinsics: Count decomposed primitives too
nir: Also gather decomposed primitive count
nir: Add intrinsics for lowering GS
nir: Add intrinsics for lowering bindless textures/samplers
nir/print: handle adjacency
asahi: Clamp 8-bit integer RTs
agx: Legalize image MS index
agx: Fix fragment side effects scheduling
agx: Check for spilling in release builds
docs/features: Mark ARB_mdi done on asahi
agx: Cleanup 8-bit math before lowering
agx: Require 32-bit alignment for EOT offset
agx: Add scaffolding for subgroup ops
agx: Translate simple subgroup ops
asahi: Pack non-border colour sampler desc
agx: Allow drivers to lower texture handles
asahi: Lower samplers to bindless if needed
agx: Lower LOD bias earlier
agx: Handle bindless samplers
asahi: Handle load_sampler_handle
asahi: Add sampler heap data structure
asahi: Use the sampler heap
asahi: Upload tex/samplers properly with merged shaders
asahi: Don’t hazard track fake resources
asahi: Refactor encoder data structure
asahi: Factor out agx_launch
asahi: Make encoder_allocate public
asahi: Add data structures for geometry shaders
asahi: Add helpers for lowering GS
asahi: Add GS lowering pass
asahi: Wire up geometry shaders
asahi: Advertise geometry shaders
asahi: rm unused deqp debug flag
asahi: Don’t use OpenGL clip bit
asahi: Plumb clip_halfz bit from RS
asahi: Advertise ARB_clip_control
asahi: Implement timer queries
docs: Mark timer queries as done on asahi
asahi: Implement ARB_base_instance
nir: Simplify nir_alu_instr_channel_used definition
nir/validate: Optimize ssa_srcs set
nir/validate: Don’t spam nir_alu_instr_channels
nir/validate: Don’t validate out-of-bounds channels
nir/validate: Use unlikely for validate_assert
nir/validate: Don’t check dimensions in validate_def
nir/validate: Drop stale todo
nir/validate: Inline validate_ssa_src
nir/validate: Split out validate_sized_src
nir/validate: Specialize if source validation
panfrost: Add an allow_rotating_primitives() helper
panfrost: Factor out vertex attribute stride calculation
panfrost: Add panfrost_get_{position,varying}_shader() helpers
gallium: add pipe_shader_from_nir helper
radeonsi: use pipe_shader_from_nir
v3d: use pipe_shader_from_nir
asahi: use pipe_shader_from_nir
vc4: use pipe_shader_from_nir
zink: use pipe_shader_from_nir
nouveau: use pipe_shader_from_nir
panfrost: use pipe_shader_from_nir
gallium: drop pipe_shader_state_from_nir
mesa/st: collapse tgsi deadcode
mesa/st: use pipe_shader_from_nir
nir/lower_tex: Add 1D lowering
agx: fix 1D texture sampling
ac,radv,radeonsi: use common 1D texture lowering
nir/format_convert: handle clamping smaller bit sizes
nir/lower_idiv: Optimize idiv sign calculation
agx: Hotfix for stack_adjust in GS
asahi/decode: Decode multiple macOS commands
asahi: Quiet clang warning
asahi: Add half float type to genxml
asahi: Add XML for hw tessellation
asahi: Identify Primitive ID frag input
asahi: Identify bicubic filtering mode
asahi: fix index bias with GS/XFB
asahi: Sync heap size
asahi: init clear colour between batches
asahi: clamp clear colours
asahi: handle self blits
asahi: bump limits
asahi: remove bogus assertion
asahi: be robust about null xfb
asahi: fix dirty tracking fail with point sprites
asahi: handle null PBE
asahi: Be robust with arrays of images
asahi: fix imageSize of null image
asahi: rm compact image atomic descriptors
asahi: use 2D descriptors for cubes
asahi: defer texture packing to draw-time
ail: handle >4GiB textures
asahi: return GL_OOM for excessive image sizes
asahi: fix meta usc builder allocation
asahi: implement xfb stream queries
asahi: fix output to non-rast streams
asahi: bump glsl version
asahi: minify when blitting for transition
asahi: blit with the old format when transitioning
asahi: flush before resource transition
agx: Fix flatshading of matrices
asahi: fix xfb of pointsize when not drawing points
asahi: defeature quads
asahi: Rotate tri fans based on provoking vtx
asahi: use GS for first-provoking fans
asahi: Early out for GS + rast discard
asahi: Implement draw parameters
agx: wire up texture_samples/image_samplers
asahi: advertise ARB_shader_texture_image_samples
asahi: fix layout transitions with arrays
asahi: use correct target packing PBE
asahi: choose staging bind better
asahi: fix destroy_query leaving dangling references
asahi: add agx_push macro
asahi: collapse unreachable condition
asahi: use agx_push
asahi: remove dead declarations
asahi: rm unnecessary uniform upload for GS
asahi: make UB easier to see
asahi: force GS for indirect prim gen query
asahi: rework GS input assembly
asahi: Implement multidraw indirect
asahi: move heap alloc to first use
asahi: double depth bias
asahi: add static assert
agxdecode: fix stack smash with border colour
asahi: Support L/A/I formats for texture buffers
asahi: fix tri fan enum
asahi: rework cf binding xml
asahi: add xml for flatshading fans
agx: fix VARYING_SLOT_COL0 getting flatshaded
agx: Avoid scratch mem with tri strip w/ adjacency
agx: rework libagx linking a bit
asahi: Unroll GS/XFB primitive restart on the GPU
asahi: Lower edge flags
asahi: assert hw invariant
asahi: rewrite pointsize handling
agx: remove spurious z/s writes in force early-z shaders
agx: handle force early-z + discard
agx: note that sample_mask runs occlusion queries
agx: allocate varying slot if writing viewport only
agx: report if we have a nonzero viewport
asahi: allow empty scissor box
asahi: add XML for multiple viewports
asahi: Implement ARB_viewport_array
asahi: handle some components/offsets in GS lowering
asahi: prepare gs copy shaders for compact clip/cull
asahi: handle compact clip/cull in gs component gather
asahi: Implement ARB_cull_distance
asahi: add more BGR formats
asahi: fix dupe rgb65 formats
asahi: fix pbe swizzling
asahi: fix integer RT clamping
agx: fix fp64 lowering options
agx: Lower 64-bit I/O to 32-bit
agx: don’t produce split of immediate
asahi: fix size calculation for 2d msaa arrays
asahi: allow more format reinterpretation
asahi: respect render condition for compute
asahi: wire up hardware gl_PrimitiveID
asahi: clamp draw count for mdi
gallium: fix util_clamp_color type confusion
gallium: add PIPE_IMAGE_ACCESS_DRIVER_INTERNAL
nir/validate: allow bias on nir_texop_lod
asahi: Implement lod queries
vtn: fuse OpenCL mad if we can
asahi: fix eMRT + background load interaction
ail: add is_level_compressed query
ail: use is_level_compressed
ail: add ail_is_level_twiddled_uncompressed
asahi: do not use compression blits for uncompressed levels
agx: allow bindful arrays if not clamping
asahi: don’t format convert with staging blits
asahi: implement arrays as 2d for internal images
asahi: respect last_block
asahi: allow compressed image stores in blits
asahi: fix image_mask with unbind num trailing
asahi: add compute blitter
asahi: add and use batch_is_compute helper
asahi: fix get_batch with compute batches
asahi: allow multiple compute dispatches in a batch
asahi: drop custom mipmap generate
asahi: set data_valid on first draw
asahi: fix data valid tracking
asahi: reduce transfer map flushing with staging blits
asahi: do not stall for writers with invalid mips
asahi: implement blit-based resource_copy_region
asahi: fix snorm staging blits
asahi: use copy region for decompression
asahi: fix scissor arrays
asahi: disable compute-based blitter for now
agx: use more mem->tex barriers even on g13g
agx: fix early-z + discard together
asahi: fix set_sampler_views
asahi: fix max tex sizes
agx: optimize fcmp like fcmpsel
agx: wire up some ballots
agx: lower votes to ballots
agx: implement query_levels
agx: skip scoreboard bit in builder for !wait
agx: make vec widths explicit in IR
agx: validate post-RA
agx: rm silly todo
agx: rm outdated comment
agx: add index size helper
agx: trust in agx_index size
agx: mv agx_read/write_regs to validator
agx: use custom assert when packing
agx: use mov imm for pcopies
agx: allow phis with 16bit imms
agx: prepare for immediates in phis
agx: handle imm inlining into phis
asahi: rework compute emptiness tracking
asahi: stub qbo on the cpu
asahi: implement xfb overflow queries
agx: const fold after discard lowering
agx: fix xfb of invalid comp
agx: fix xfb of invalid var
asahi: bump vertex shader outputs
asahi: rm pointless multisample key bit
asahi: rm layered bit from shader key
asahi: implement point sprites w/o shader key
asahi: rm unused blend enable bit
asahi: rm logicop enable bit
asahi: rm nr_cbufs from key
asahi: rm blend->store from shader key
asahi: rm vbuf.count from key
asahi: rm agx_vbufs wrapper
asahi: invert program_point_size
asahi: divide by xfb stride for xfb draws
asahi: disable fp16 cbuf cap
asahi: add missing GS line strip (+adj) handling
asahi: link libagx before lowering mem access widths
asahi: cl-ify some xfb logic
asahi: factor out libagx_map_vertex_in_tri_strip
asahi: rotate xfb’d tri strips
asahi: inline something silly
asahi: plumb get_ubo_size
asahi: make txf robust properly
asahi: fix passthrough GS with poly modes
asahi: add missing tib alignment check
agx: optimize split(64-bit uniform)
agx: expand agx_index
agx: fix 64-bit phis with inlined immediates
agx: add unit test for pcopy lowering bug
agx: require min alignment for load/store vectorize
asahi: fallback some resource copies
asahi: don’t canonicalize nans/flush denorms when copying
agx: unit test split uniform opt
agx: clang-fmt
nir,zink: Redefine flat_mask in terms of I/O locations
Andrew Gazizov (4):
venus: Add use_guest_vram capset to enable guest-based blob alloc
venus: Use vk_object_id as blob_id for guest_vram device memory alloc
venus: Tighten the conditions for guest_vram device memory alloc
venus: Make sure that guest allocated blobs from hostmem are mappable
Anthony Roberts (1):
glsl: Use unsigned instead of enum type in ir_variable_data
Antoine Coutant (1):
clc: retrieve libclang path at runtime.
Antonio Gomes (14):
rusticl, meson: Move libc functions to their own crate
rusticl, meson: Add gl/egl/glx bindings
iris: Fixups in resource_get_handle and resource_from_handle
mesa/st: Add new data to mesa_glinterop
mesa/st, dri2, wgl, glx: Modify flush_objects interop func to export a fence_fd
rusticl: Add xplat helpers to dynamic link interop functions
rusticl/device: Function to check for gl interop support
rusticl/device: Enable gl_sharing only if create_fence_fd is implemented
rusticl: Add functions to create CL ctxs from GL, and also to query them
rusticl/format: Add conversion table for GL->CL
rusticl: Create CL mem objects from GL
rusticl: Add support for cube maps
rusticl: Flush objects just before importing them
rusticl: Advertise cl_khr_gl_sharing extension
Anuj Phogat (1):
intel/l3: Adjust URB weight calculation for gfx12.5+.
Asahi Lina (12):
asahi: Fix CDM Launch/Barrier naming
asahi: Add extra CDM barrier bit for G13X
asahi: Move USC cache flush to agx_batch_init_state
asahi: Add more memory barrier opcodes
asahi: Add extra barrier for texture atomics on G13X
ail: Fix miptree offset generation for compressed textures
ail: Add explicit specification of mip level strides
ail: Fix tile size & strides for compressed textures
asahi: Add .editorconfig for CL files
asahi: Implement BO alignment
agx: Fix packing of stack map/unmap
agx: Add scoreboarding to stack instructions
Bas Nieuwenhuizen (11):
radv: Add DGC preprocessing barrier support.
radv: Add compute DGC preprocessing support.
radv: Add some initial graphics DGC preprocessing support.
radv: Add implementation of cmd buffers for a sparse binding queue.
radv: Remove the sparse binding queue from coherent images.
radv: Move sparse binding into a dedicated queue.
nir: Add nir_static_workgroup_size helper.
nir: Add pass for clearing memory at the end of a shader.
radv: Add option to clear LDS at the end of a shader.
radeonsi: Add support to clear LDS at the end of a shader.
radv: Use correct writemask for cooperative matrix ordering.
Benjamin Lee (14):
nak: make sm available in builders
nak: Legalize a bunch of instructions for SM50
nak: add IADD instruction for SM50
nak: implement ST* and LD* on SM50
nak: add ATOM{G,S} encoding for SM50
nak: add carry register file
nak: move iadd64 construction to a builder method
nak: use carry register file for IADD2
nak: make as_imm_not_{i,f}20 helper methods public
nak: implement SHL and SHR on SM50
nak: implement IMUL for SM50
nak: encode Dst::None as RZ on SM50
nak: implement SHFL on SM50
nak: implement VOTE on SM50
Boris Brezillon (74):
pan/genxml: Fix “{Last,First} Heap Chunk” field position
panfrost: Fix format_minimum_alignment() for v6-
pan/bo: Make sure we catch refcnt underflows
pan/genxml: Fix ‘Shader Program’ descriptor definition on v9 and v10
pan/decode: Print the resource table label
pan/decode: Make CSF decoding more robust to NULL pointers
pan/decode: Fix the pan_unpack() call for JUMP instruction unpacking
panfrost: Flag the right shader when updating images
panfrost: Kill unused panfrost_batch::polygon_list field
panfrost: Emit attribs in panfrost_update_state_3d() on bifrost/midgard
panfrost: Emit image attribs for compute in panfrost_update_shader_state()
panfrost: Rename panfrost_vtable::context_init
panfrost: Inline pan_emit_tiler_heap()
panfrost: Inline pan_emit_tiler_ctx()
panfrost: Count draws at the batch level
panfrost: Express the per-batch limit in term of draws
panfrost: Count the number of compute jobs at the batch level
panfrost: Make panfrost_has_fragment_job() public
panfrost: Stop using the scoreboard to check the presence of draws/compute
panfrost: Store the fragment job descriptor address in the batch
panfrost: Emit the fragment job from panfrost_batch_submit()
panfrost: Move the panfrost_emit_tile_map() call around
panfrost: Get rid of unused in_sync parameter in panfrost_batch_submit[_ioctl]()
panfrost: Get rid of the out_sync parameter in panfrost_batch_submit_jobs()
panfrost: Get rid of unused fb parameter passed to panfrost_batch_submit_jobs()
panfrost: Add a submit_batch() hook to panfrost_vtable
panfrost: Store the index pointer in panfrost_batch
panfrost: Stop passing vertex attribute arrays around
panfrost: Store varying related fields in panfrost_batch
panfrost: Use u_reduced_prim() to do the is_line check
panfrost: Move JM specific fields to their own struct
panfrost: s/panfrost_emit_vertex_tiler_jobs/jm_push_vertex_tiler_jobs/
panfrost: Move the JM-specific bits out of emit_fragment_job()
panfrost: Rename several job emission helpers
panfrost: Factor out the point-sprite shader update logic
panfrost: Factor out the vertex count logic
panfrost: Re-order things in panfrost_direct_draw()
panfrost: Move all JM-specific bits out of panfrost_direct_draw()
panfrost: Use batch->tls.gpu to store the compute TLS descriptor
panfrost: Move JM-specific bits out of panfrost_launch_grid_on_batch()
panfrost: Move JM specific bits out of panfrost_launch_xfb()
panfrost: Drop the vertex_count argument passed to panfrost_batch_get_bifrost_tiler()
panfrost: Rename panfrost_batch_get_bifrost_tiler()
panfrost: s/panfrost_emit_shader/jm_emit_shader_env/
panfrost: s/panfrost_emit_primitive/jm_emit_primitive/
panfrost: Rename JM-specific batch submission helpers
panfrost: s/preload/jm_preload_fb/
panfrost: s/init_batch/jm_init_batch/
panfrost: Prepare things for the common/JM cmdstream split
panfrost: Move JM helpers to their own source file
panfrost: Add a JOBX() macro to simplify job-frontend selection
panfrost: Fix multiplanar YUV texture descriptor emission on v9+
panfrost: Don’t leak NIR compute shaders
panfrost: s/pan_scoreboard/pan_jc/
panfrost: Rename pan_cs.{c,h} into pan_desc.{c,h}
panfrost: Make pan_afbc_compression_mode() per-gen
panfrost: Restrict job chain helpers to JM hardware
panfrost: Restrict job descriptor emission to JM hardware
util/hash_table: Use FREE() to be consistent with the CALLOC_STRUCT() call
util/hash_table: Don’t leak hash_u64_key objects when the entry exists
util/hash_table: Don’t leak hash_key_u64 objects when the u64 hash table is destroyed
panfrost: Abstract kernel driver operations
pan/kmod: Add a backend for the panfrost kernel driver
panfrost: Avoid direct accesses to some panfrost_device fields
panfrost: Avoid direct accesses to some panfrost_bo fields
panfrost: Back panfrost_device with pan_kmod_dev object
panfrost: Add a VM to panfrost_device
panfrost: Back panfrost_bo with pan_kmod_bo object
panfrost: Introduce a PAN_BO_SHAREABLE flag
panvk: Pass PAN_BO_SHAREABLE when relevant
panfrost: Flag BO shareable when appropriate
panvk: Fix tracing
panvk: Fix access to uninitialized panvk_pipeline_layout::num_sets field
panfrost: Clamp the render area to the damage region
Boyuan Zhang (4):
gallium/pipe: define hevc max slices number
frontend/va: add support for multi slices reflist
radeonsi: add new interface to handle multi slice reflist
radeonsi/vcn: add new logic for hevc multi slices reflist
Brian King ((MEDIA)) (1):
d3d12: Add constraint_set1_flag support
Caio Oliveira (90):
anv: Fix leak when compiling internal kernels
intel/compiler: Remove unused parameter from brw_nir_adjust_payload()
intel/compiler: Take more precise params in brw_nir_optimize()
intel/compiler: Remove unused parameter from brw_nir_analyze_ubo_ranges()
intel/compiler: Clarify the asserts in nir_load_workgroup_id lowering
intel/compiler: Rework opt_split_sends to not rely/modify LOAD_PAYLOAD
intel/compiler: Re-enable opt_zero_samples() for Gfx7+
intel/compiler: Re-enable opt_zero_samples() in many cases for Gfx12.5
intel/compiler: Remove is_tex()
intel/compiler: Use linear allocator in parts of brw_schedule_instructions
intel/compiler: Remove reference to brw_isa_info from schedule_node
intel/compiler: Allocate all schedule_nodes at once
intel/compiler: Use array to iterate the scheduler nodes
intel/compiler: Add only available instructions to scheduling list
intel/compiler: Extract scheduling related basic functions
intel/compiler: Cache issue_time information
intel/compiler: Remove virtual calls from scheduler
intel/compiler: Move FS specific fields to fs_instruction_scheduler
intel/compiler: Merge child/latency arrays in schedule_node
intel/compiler: Tidy up code in scheduler related to reads_remaining
intel/compiler: Move earlier scheduler code that is not mode-specific
intel/compiler: Separate schedule_node temporary data
intel/compiler: Make scheduler classes take an external mem_ctx
intel/compiler: Reuse same scheduler for all pre-RA scheduling modes
intel/compiler: Clear up block instructions before re-adding them
intel/compiler: Simplify allocation of NIR related arrays
intel/compiler: Prefer ctor/dtors in some Google Tests
intel/compiler: Don’t use fs_visitor::bld in tests
intel/compiler: Don’t use fs_visitor::bld in fs_reg_alloc
intel/compiler: Don’t use fs_visitor::bld in thread payload classes
intel/compiler: Add a few more helpers to fs_builder
intel/compiler: Allow dumping CFG to a specific FILE*
intel/compiler: Sort lists of succs and preds in CFG dump output
intel/compiler: Add a few tests to opt_predicated_break
anv/xe2+: Use Region-based Tessellation redistribution
iris/xe2+: Use Region-based Tessellation redistribution
intel/compiler: Refactor program exit in intel_clc
intel/compiler: Use single variable instead of dynarray
intel/compiler: Fix memory leaks in intel_clc
intel/compiler: Remove the linking step in intel_clc
intel/compiler: Remove unused headers
intel/compiler: Move NIR emission code to brw_fs_nir.cpp
intel/compiler: Make NIR intrinsic emission functions static
intel/compiler: Make more functions in NIR conversion static
intel/compiler: Make functions for NIR control flow conversion static
intel/compiler: Make setup functions of NIR emission static
intel/compiler: Make non-intrinsic NIR conversion functions static
intel/compiler: Make NIR atomic conversion functions static
intel/compiler: Make NIR resources helpers static
intel/compiler: Move nir_ssa_value into a local structure
intel/compiler: Move remaining NIR conversion fields to nir_to_brw_state
intel/compiler: Stop using fs_visitor::bld field in NIR conversion
intel/compiler: Annotate and use nir_to_brw_state::bld
intel/compiler: Don’t use fs_visitor::bld in remaining places
intel/compiler: Remove fs_visitor::bld
intel/compiler: Make fs_visitor not depend on fs_builder
intel/compiler: Make fs_builder include fs_visitor and not the other way
intel/compiler: Add ctor to fs_builder that just takes the shader
intel/compiler: Create and use nir_to_brw() function
intel/compiler: Use reference instead of pointer for nir_to_brw_state
intel/compiler: Use reference instead of pointer for fs_visitor
compiler/glsl: Reduce scope of is_anonymous
clover: Remove usage of glsl_type C++ helpers
compiler/types: Add a few more helpers to get builtin types
intel/compiler: Use C helpers to access builtin types
compiler: Remove C++ static member pointers to builtin types
intel/compiler: Use glsl_type C helpers
r600/sfn: Use glsl_type C helpers
nouveau: Use glsl_type C helpers
nir: Use glsl_type C helpers
mesa: Use glsl_type C helpers
lima: Use glsl_type C helpers
compiler/types: Add a few more glsl_type C helpers
glsl: Use glsl_type C helpers
compiler/types: Remove glsl_type C++ helpers
compiler/types: Use a typedef for glsl_type
intel/cmat: Add pass to lower cooperative matrix to subgroup operations
intel/dev: Add cooperative matrix configuration information
anv: Implement VK_KHR_cooperative_matrix
util: Add a way to set the min_buffer_size in linear_alloc
spirv: Use linear_alloc for parsing-only data
spirv: Use value_id_bound to set initial memory allocated
intel/fs: Only allocate acp_entry if we are adding one
intel/fs: Use linear allocator in opt_copy_propagation
intel/fs: Use linear allocator in fs_live_variables
anv: Don’t print warnings for GRL kernel compilations
intel/compiler: Use INTEL_DEBUG=cs to ask for brw_compiler output
nir: Disable -Wmisleading-indentation when compiling with GCC
ci: Add Werror=misleading-indentation to debian-clang
intel/compiler: Fix rebuilding the CFG in fs_combine_constants
Casey Bowman (1):
anv: Override vendorID for Diablo IV
Chia-I Wu (14):
radv: fix vkCmdCopyImage2 for emulated etc2/astc
radv: stop using vk_render_pass_state::render_pass
vulkan, tu, pvr: remove vk_render_pass_state::render_pass
radv: fix image view extent override for astc
radv: minor clean up to image view extent override
ac: be careful with stencil_offset override
radv: disable TC-compat htile on GFX9 in some cases
radv: fix VkDrmFormatModifierProperties2EXT for multi-planar formats
radv: fix VkSubresourceLayout2KHR for multi-planar formats with modifiers
radv: fix a typo in radv_image_view_make_descriptor
radv: fix asserts for radv_init_metadata
radv: convert a check in radv_get_memory_fd to assert
vk/util: ignore unsupported feature structs
Revert “vk/util: ignore unsupported feature structs”
Chris Spencer (7):
meson: Add option to ignore artificial Android limitations
android.mk: Add option to pass arbitrary parameters to meson
anv/android: Only limit advertised Vulkan version in strict mode
radv/android: Only limit advertised Vulkan version in strict mode
v3dv/android: Only limit advertised Vulkan version in strict mode
vn/android: Only limit advertised Vulkan version in strict mode
vulkan/android: Only limit advertised extensions in strict mode
Christian Gmeiner (13):
agx: Re-index nir defs to reduce memory usage
ci/etnaviv: Update ci expectation
etnaviv: rs: Call etna_rs_gen_clear_surface(..) when needed
etnaviv: Mark etna_rs_gen_clear_surface(..) private
docs: Update etnaviv extensions
etnaviv: Update headers from rnndb
etnaviv: Add static_assert(..) to catch memory corruption
isaspec: Add bool_inv type to print inverted bools
etnaviv: Add isaspec support
etnaviv: disassembler: Switch to isaspec
mesa: Drop not used program_written_to_cache
nir/opt_peephole_select: handle speculative ubo loads
pan/mdg: Use nir_builder for load_sampler_lod_parameters_pan
Colin Marc (1):
vulkan video: correctly set SPS VUI bits
Connor Abbott (32):
util/rb_tree: Fix editorconfig
util/rb_tree: Add augmented trees and interval trees
freedreno/ci: Remove minetest trace
v3d/ci: Remove minetest trace
vk,lvp,tu,radv,anv: Add common vk_*_pipeline_create_flags() helper
vk/graphics_state: Support VK_KHR_maintenance5
vk/graphics_state, tu: Rewrite renderpass flags handling
vk/graphics_state: Support VK_EXT_attachment_feedback_loop_dynamic_state
vk/graphics_state: Add vk_pipeline_flags_feedback_loops helper
tu: Assume no raster-order attachment access with NULL DS/blend state
tu: Fix order of rasterizer_discard check
tu: Make sure copies to half-float formats are bit exact
tu: Fix getting VkDescriptorSetVariableDescriptorCountLayoutSupport
ir3/ra: Don’t swap killed sources for early-clobber destination
nir: Add quad vote intrinsics
amd: Implement quad_vote intrinsics
nir/subgroups: Add option to lower Boolean subgroup reductions
amd: Enable boolean subgroup lowering
tu: Fix re-emitting VS param state after it is re-enabled
tu: Don’t use pipeline layout to emit shared const enable
tu: Rework dynamic offset handling
tu: Make filling out tu_program_state not depend on the pipeline
tu: Move shader linking to tu_shader.cc
freedreno/afuc: Handle store instruction on a5xx
freedreno/afuc: Add separate “SQE registers”
freedreno/afuc: Use SQE registers for call stack
freedreno/afuc: Add syntax for pre-increment addressing
freedreno/afuc: Decode (sdsN) modifier
freedreno: Update more control/pipe registers for a7xx
freedreno/afuc: README updates for a7xx
freedreno/afuc: Fix gen autodetection for a7xx
ir3/legalize: Fix helper propagation with b.any/b.all/getone
Corentin Noël (10):
mesa/bufferobj: ensure that very large width+offset are always rejected
virgl: fill the array_size value when using PIPE_TEXTURE_CUBE
virgl/texture: Align destination box to block depth
mesa/ffvs: Use gl_state_index16 in helpers directly
gallivm: Initialize indir_index to NULL before use
gallivm/lp_bld_nir_aos: Use TGSI instead of PIPE enum
mesa: Use a switch for state_iter and be more precise about its type
frontends/va: Remove wrong use of ProfileToPipe
virgl: Only send the same amount of data as declared in pipe_sampler_state
virgl: Assert build_id_note before dereferencing it
Daniel Almeida (33):
nak: derive From<OpFoo> for Op through a proc macro
nak: make Instr::new() generic
nak: compiler: add From<T:Into<Op>> for Instr
nak: compiler: replace Instr::new(..) with OpFoo {}.into()
nak: Heap-allocate Instrs
nak: Do not allocate vectors needlessly in optimization passes
nak: add support for floor, ceil and trunc
nak: run nir_lower_frexp and nir_opt_algebraic_late
nak: more lowerings
nak: change ishl data type to I32
nak: add support for nir_op_isign
nak: Add support for nir_op_bitcount
nak: add support for nir_op_bitfield_reverse
nak: add support for findmsb,findlsb
nak: add support for packhalf2x16_split
nak: add support for nir_op_unpack_half_2x16_split_{x|y}
nak: add support for atomic cmpxchg on images
nak/sm50: rewrite encode_iadd2 to not use encode_alu()
nak: sm50: rewrite fsetp to not use encode_alu
nak: sm50: Rewrite fmnmx to not use encode_alu
nak: sm50: rewrite fmul to not use encode_alu
nak: sm50: rewrite fset to not use encode_alu
nak: sm50: rewrite iabs to not use encode_alu
nak: sm50: convert sel to not use encode_alu()
nak: sm50: convert i2f to not use encode_alu()
nak: sm50: rewrite encode_f2f to not use encode_alu()
nak: convert encode_imad to not use encode_alu()
nak: sm50: rewrite encode_popc to not use encode_alu()
nak: sm50: rewrite encode_prmt to not use encode_alu()
nak: sm50: remove encode_alu() and friends
nak/sm50: remove ALUSrc and friends
nak/sm50: remove *fmod* calls from iabs
nak: sm50: fix ineg legalization
Daniel Schürmann (24):
nir/lower_subgroups: optimize reductions with cluster_size == 1
nir: optimize open-coded quadVote* directly to new nir_quad intrinsics
aco: delete instruction selection for boolean subgroup operations
nir: remove info.fs.needs_all_helper_invocations
nir/gather_info: add missing wide subgroup operations
nir: add info.fs.require_full_quads
aco: enable helper lanes if shader->info.fs.require_full_quads
amd: rename max_wave64_per_simd -> max_waves_per_simd
aco: rename max_wave64_per_simd -> max_waves_per_simd
radv: fix number of physical SGPRs on GFX10+
aco: remove VCCZ and EXECZ register handling
nir/opt_loop: move loop control-flow optimizations into separate pass
treewide: replace calls to nir_opt_trivial_continues() with nir_opt_loop()
nir: remove nir_opt_trivial_continues()
nir: remove redundant passes from nir_opt_if()
nir/opt_loop_cf: generalize removal of “trivial” continues
aco: fix should_form_clause() for memory instructions without operands
aco: form clauses for LDS instructions
aco: add new post-RA scheduler for ILP
aco: refactor and speed-up dead code analysis
nir/opt_move_discards_to_top: don’t schedule discard/demote across subgroup operations
nir/gather_info: fix enumeration of wide subgroup intrinsics
aco: give spiller more room to assign spilled SGPRs to VGPRs
aco/insert_exec_mask: Fix unconditional demote at top-level control flow.
Daniel Stone (7):
ci: Try really hard to print final result string
ci/radeonsi: Occlusion queries are flaky on stoney
ci: Fix trivial typo in ARTIFACTS_BASE_URL
panfrost/ci: Remove Vulkan expectations from G57
panfrost/ci: Add environment variable to suppress warnings
panfrost/ci: Skip broken image copy tests
ci: Re-enable Collabora farm
Danylo Piliaiev (15):
tu: Fix reading of stale (V)PC_PRIMITIVE_CNTL_0
tu/a7xx: Zero out A7XX_VPC_PRIMITIVE_CNTL_0 in 3d blits
tu/a6xx: Exclude REG_A6XX_TPL1_UNKNOWN_B602 from reg stomping
tu/a7xx: Fix occlusion queries on pre-A740 GPUs
tu: Always print startup failure messages
tu: Return error when GPU is unsupported
freedreno/devices: Support Adreno 725
tu: Add a725 workaround dispatch at the start of each cmdbuf
freedreno/devices: Separate device definition into base + gen features
freedreno,tu,ir3: Pass fd_dev_info into ir3_compiler_create
freedreno,tu: Add env vars to modify fd_dev_info
freedreno: Add a644 support
freedreno/devices: Update a690 magic regs from WSL blob
turnip: Disable UBWC for D/S images on A690
freedreno: Disable UBWC for D/S images on A690
Dave Airlie (38):
vulkan: update video headers
vulkan/video: add support for h264 encode to common code
vulkan/video: add h265 encode support
vulkan/video: add h264 nal enum
vulkan/video: add a nal_unit lookup for hevc
util: add a bitstream encoder for video stream headers.
vulkan/video: add h264 level idc convertor utility
vulkan/video: add a h265 level translator.
vulkan/video: add h264 headers encode
vulkan/video: add h265 header encoders.
nak: fix backtrace crash running computeheadless
nak: make ipa encoding match the order in codegen gv100
nak: do perspective divide for interp none as well
nvk/xfb: set correct counter buffer for writing stream out counters.
nvk/nil: allow storage on VK_FORMAT_A2B10G10R10_UINT_PACK32
nvk: fix transform feedback with multiple saved counters.
nvk/nak/xfb: handle skipping properly when setting xfb_attr.
nvk: drop unneeded shader type conversion function
nvk/nak: fix regression with shf changes on sm70
intel/compiler: move gen5 final pass to actually be final pass
vulkan/video: drop encode beta checks and rename EXT->KHR
gallivm: handle llvm 16 atexit ordering problems.
intel/compiler: fix release build unused variable.
intel/compiler: revert part of “Move earlier scheduler code that is not mode-specific”
llvmpipe: fix caching for texture shaders.
gallivm/sample: refactor first/last level handling and use level_zero_only.
gallivm/sample: add some num_samples vs level zero only support
gallivm/sample: make the load_mip helper useful outside this file.
gallivm/lp: reduce size of lp_jit_texture.
gallivm/lp: reduce image descriptor size.
gallivm/lp: merge sample info into normal info
gallivm/lp: move sampler index around to reduce struct
lavapipe: bump .maxResourceDescriptorBufferRange
intel/compiler: reemit boolean resolve for inverted if on gen5
radv: don’t emit cp dma packets on video rings.
radv/video: refactor sq start/end code to avoid decode hangs.
radv: don’t submit empty command buffers on encoder ring.
gallivm: passing fp16_split_fp64 to fp16 lowering.
Dave Stevenson (2):
gallium: Add more TinyDRM drivers to the list of kmsro drivers
gallium: Add udl (DisplayLink) to the list of kmsro drivers
David Heidelberg (53):
ci/docs: add coreutils
ci: bump tags
ci/zink: reduce premerge testing on a618 to ~ 12 minutes
ci: hide Mesa install phase
ci: drop clover from release builds and remove rusticl build
ci: simplify debian-rusticl-testing definition
ci: drop mingw and wine from the x86_64 build container
ci: always cleanup pip and cargo leftovers
ci: bashify scripts, use arrays
ci: drop debootstrap, unused
ci/panfrost: run T860 traces as intended (nightly job)
ci/venus: reduce pre-merge to fit under 15 min
ci/alpine: do not store apk cache
ci/wine: move wine configuration into rootfs where is wine available
Revert “ci/wine: move wine configuration into rootfs where is wine available”
ci/lava: add wine into the amd64 ephemeral container packages
ci/zink: restore full premerge testing on Adreno 618
ci: fixup section names
ci/nouveau: define a kernel and dtb, so we can fetch it from external sources
ci: inject gfx-ci/linux S3 artifacts without rebuilding containers
ci/zink: disable nheko trace, as it sometimes crashes
gitlab: make commit more commit-like formatted
ci: tag sanity, rustfmt and clang-format job as a “placeholder” job
ci/traces: drop the freedoom-phase2-gl-high.trace
ci: disable Anholt farm
ci/freedreno: disable a660 as it’s down now
Revert “ci/freedreno: disable a660 as it’s down now”
ci: bump kernel to 6.6.4
docs: drop unused manual optimizations override
ci/freedreno: mark unvanquished-lowest trace as flaky and skip
ci/freedreno: switch Adreno 630 boards back to 6.4 kernel
ci/freedreno: increase fraction for Vulkan testing
ci/tu: add another failing pipeline strip draw
ci/freedreno: extend timeout for full runs
ci/freedreno: re-enable two Adreno 618 tests
ci/freedreno: timestamp-get no longer fails on Adreno
ci/freedreno: downgrade a618_piglit to 6.4 kernel
ci/freedreno: fail introduced by ARB_post_depth_coverage
rusticl: add freedreno alias for RUSTICL_ENABLE
ci/freedreno: more issues showed up on a618, let’s use 6.4
ci/austriancoder: separate HW definition from SW
ci/freedreno: downgrade whole Adreno 6xx series, incl. zink-a618 jobs
ci/broadcom: separate HW definition from SW
ci: skip EGL functional color_clears tests for Wayland
ci/lava: separate HW definitions from SW
ci/google: re-enable farm
ci/zink: update piano trace
ci/radeonsi: disable VA-API testing on raven
ci: enable ci-deb-repo for libdrm 2.4.119 (and others in the future)
ci/alpine: update to latest to get libdrm 2.4.119
ci: bump Fedora and Android libdrm2 to 2.4.119
ci/rootfs: add libdrm also inside the rootfs
ci/deqp: uprev deqp-runner for Linux too to 0.18.0
David Rosca (19):
frontends/va: Map decoder and postproc surfaces for reading
radeonsi/vce: Implement destroy_fence vfunc
radeonsi/uvd: Implement destroy_fence vfunc
radeonsi/uvd_enc: Implement destroy_fence vfunc
radeonsi/uvd_enc: Fix leaking session info buffer
Revert “radeon/radeon_vce: fix out of target bitrate in CBR mode (H.264)”
radeonsi/vce: Tweak motion estimation params for better quality
radeonsi/vce: Add VUI parameters in output bitstream
radeonsi/uvd_enc: Add VUI parameters in output bitstream
radeonsi: Fix offset for linear surfaces on GFX < 9
gallium/auxiliary/vl: Fix coordinates clamp in compute shaders
gallium/auxiliary: Fix coordinates clamp in util_compute_blit
gallium/auxiliary/vl: Scale dst_rect x0/y0 when rendering chroma plane
gallium/auxiliary/vl: Support interleaved input in deinterlace filter
Revert “frontends/va: Alloc interlaced surface for interlaced pics”
gallium/auxiliary: NIR blit_compute_shader
gallium/auxiliary/vl: NIR compute shaders
util/rbsp: Fill bits twice if reading more than 16 bits
radeonsi/vcn: Fix H264 slice header when encoding I frames
Dennis Bonke (1):
mesa: add managarm support
Dmitry Baryshkov (9):
freedreno/regs/mdp_common: change BPC1 -> BPC4
freedreno/regs/mdp_common: fix BPC comments
freedreno/regs: add mdp_fetch_mode enum
freedreno/drm: fallback to default BO allocation if heap alloc fails
ir3: fix shift amount for 8-bit shifts
ir3/a6xx: fix ldg/stg of ulong2 and ulong4 data
freedreno/drm: notify valgrind about FD_BO_NOMAP maps
freedreno/drm: don’t crash in heap allocator when run under valgrind
freedreno/drm: don’t crash for unsupported devices
Dudemanguy (1):
vulkan/wsi/wayland: fix wl_event_queue memory leak
Dylan Baker (3):
docs: add release notes for 23.2.1
docs: Add sha256 sum for 23.2.1
meson: add wrap for libdrm
Echo J (2):
nvk: Set HOST_CACHED_BIT for the GTT type
vulkan: Remove nonexistent output in vk_synchronization_helpers target
Eric Engestrom (236):
VERSION: bump to 24.0
docs: reset new_features.txt
docs: update calendar for 23.3.0-rc1
ci/rpi4: group all spec@ext_image_dma_buf_import@ext_image_dma_buf_import-sample_* together
ci/rpi4: add spec@ext_image_dma_buf_import@ext_image_dma_buf_import-sample_yvyu to the list of known failures
ci/zink+radv: add another flake on polaris
ci: drop confusing fake `rules`, `if` and `when` on the list of rules strings
docs/ci: allow sanity job to be missing
ci: don’t run sanity in Marge pipelines
ci: add `.never-post-merge-rules` to avoid re-running pre-merge jobs after merging
broadcom: use `.never-post-merge-rules` for all rpi tests
ci/radeonsi: add another flake
rpi4/ci: add more known dEQP-EGL.functional.*.*_context.gles*.other failures
rpi4/ci: move `spec@!opengl 1.1@depthstencil-default_fb-drawpixels-24_8 samples=2` from fails for flakes after an UnexpectedPass
rpi4/ci: remove `spec@!opengl 1.1@depthstencil-default_fb-drawpixels-32f_24_8_rev samples=2` from fails as it’s a flaky test and already marked as such
Revert “ci: backport two mesh/task query fixes for VKCTS”
ci/build-deqp: stop ignoring failures while fetching patches
ci/build-deqp: split deqp version into a variable
ci/build-deqp: move mkdir earlier
ci/build-deqp: print more detailed information about what deqp version is running
ci: bump image tags to rebuild deqp
ci/rules: add missing clang-format files to what needs containers to build
broadcom/ci: merge gl test lists to use a single deqp instance
broadcom/ci: fix list indentation
broadcom/ci: split broadcom-common manual rules to .broadcom-common-manual-rules
vc4/ci: add manual variant of .vc4-rules
v3dv/ci: add manual variant of .v3dv-rules
v3d/ci: add “full run” variant of v3d-rpi4-gl:arm64 as a manual job
v3dv/ci: add “full run” variant of v3dv-rpi4-vk:arm64 as a manual job
vc4/ci: add piglit “full run” variant of vc4-rpi3-gl:arm32 as a manual job
rpi4/ci: skip more timing out tests in the dEQP-VK.ssbo.layout.* group
zink+radv/ci: simplify deqp config
zink+radv/ci: ensure renderer is “zink on radv”
ci: restore sanity (aka. Revert “ci: don’t run sanity in Marge pipelines”)
gitlab_gql: strip newline at the end of the token file
ci_run_n_monitor: compile target_jobs_regex only once
ci/gitlab_gql: stop re-compiling regex now that all users pre-compile it
v3d/ci: run manual jobs in daily pipeline
radeonsi/ci: document new failures and flakes
ci: disable lima farm as it appears to be down
radv/ci: add navi21 flakes
radv/ci: add vega10 flakes
radv/ci: add polaris10 flakes
radv+zink/ci: add polaris10 flakes
radv+zink/ci: add navi10 flakes
bin/gitlab_gql: resolve sha locally to be able to use things like `HEAD`
gitlab_gql: make `--rev` optional, defaulting to `HEAD`
bin/gitlab_gql: fix command in example
bin/gitlab_gql: only get the pipeline when a pipeline is needed
v3d/ci: add new failures
bin/gitlab_gql: only allow a single `--print-*` argument per invocation
bin/gitlab_gql: rename get_job_final_definition() to print_…() since that’s what it actually does
bin/gitlab_gql: deduplicate fetch_merged_yaml() logic between print branches
bin/gitlab_gql: give a better name to the --print-job-manifest argument value than PRINT_JOB_MANIFEST
ci/valve-infra: ensure the correct farm picks up the job
docs: update calendar for 23.3.0-rc{2,3,4} and add another release candidate
util/xmlconfig: drop default SYSCONFDIR & DATADIR values
lima: drop unused lima_get_absolute_timeout()
intel/ci: fix gl/vk dependencies in hsw jobs
intel/dev: use libdrm.h wrapper to support builds without libdrm
ci_run_n_monitor: require user to add an explicit `.*` at the end if jobs like `*-full` are wanted
amd/ci: avoid re-running all the test jobs when changing the expectations for only one of them
egl/dri2: increase NUM_ATTRIBS to fit all the attributes
asahi: use util_resource_num() instead of open-coding it
ci/piglit: specify only the traces file in the job config
amd/ci: track changes to the traces config file as well
ci: fix kdl commit fetch
ci: uprev deqp-runner from 0.16.1 to 0.18.0
ci/deqp-runner: turn paths in errors into links
docs: update calendar for 23.0.0-rc5
docs: add another -rc
ci: use released version of meson
lp: make sure 0xff is unsigned before shifting it past signed int range
intel/perf: fix regex escaping
intel/ci: fix .hasvk-manual-rules
docs: update calendar for 23.3.0
docs/calendar: add 23.3.x releases
bin/python-venv: detect python version change
ci: disable opengl & gles in debian-vulkan build
radv/ci: add navi21-aco flake
bin/gen_release_notes: fix regex raw string
bin/python-venv: fix venv folder check
bin/gen_release_notes: include removed ‘new_features.txt’ in commit
docs: add release notes for 23.3.0
docs: add sha256sum for 23.3.0
docs: fix release date for 23.3.0
turnip: fix typo in comment
ci_run_n_monitor: allow picking a pipeline by its MR
amd/ci: radeonsi is gl, not vk
v3dv: update symbols that have become aliases for newer ones
v3dv: drop duplicate flag
radv: update symbols that have become aliases for newer ones
pvr: update symbols that have become aliases for newer ones
anv: update symbols that have become aliases for newer ones
hasvk: update symbols that have become aliases for newer ones
amd/ci: fix yaml indentation
amd/ci: split common amd files list from radeonsi files list
amd/ci: limit radv jobs to radv + aco files changes
nvk: update symbols that have become aliases for newer ones
vk/runtime: update symbols that have become aliases for newer ones
vk/wsi: update symbols that have become aliases for newer ones
vk/util: update symbols that have become aliases for newer ones
vk/overlay-layer: update symbols that have become aliases for newer ones
venus: update symbols that have become aliases for newer ones
venus: fix typo in comment
amd/ci: reuse .radeonsi-rules in .radeonsi-vaapi-rules
nvk: use `||` instead of `|` between bools
radeonsi/ci: update vangogh piglit expectations
freedreno/ci: add flake seen on a630
freedreno/ci: add more flakes seen on a630
freedreno/ci: add more a630 flakes
v3d: drop leftover from “move v3d_tiling to common”
radeonsi/ci: track changes to `vpelib`
turnip: update symbols that have become aliases for newer ones
util/blob: fix trivial typo
ci: explain what we mean by the various types of pipelines
ci: turn comment into code in `sanity` job rules
ci: identify merge request pipelines using `$CI_PIPELINE_SOURCE == merge_request_event` instead of `$CI_COMMIT_BRANCH` being missing
ci: rename is-pre-merge-for-marge to is-merge-attempt to be clearer
ci: drop containers, builds, and tests from post-merge pipeline
ci: add pipeline for direct pushes to main
ci: give an explicit priority to the scheduled nightly pipelines
ci: clean up pre-merge and fork pipelines rules
ci: make sure pre-merge pipelines have the same jobs as merge pipelines
ci: improve comments
ci: take microsoft farm offline
ci: fix rules for formatting checks
zink/ci: fix yaml indentation
zink/ci: use variable to avoid repeating the list
zink/ci: expand first (and only) level of folders in the list of files
zink/ci: run only the relevant jobs when changing the ci expectations
panfrost/ci: fix yaml indentation
panfrost/ci: run only the relevant jobs when changing the ci expectations
freedreno/ci: fix yaml indentation
freedreno/ci: run only the relevant jobs when changing the ci expectations
intel/ci: fix yaml indentation
intel/ci: deduplicate common intel files rules
intel/ci: expand first level of common intel files
intel/ci: anv changes should only trigger anv jobs
intel/ci: hasvk changes should only trigger hasvk jobs
intel/ci: run only the relevant jobs when changing the ci expectations
docs/calendar: add 24.0 branchpoint and release schedule
etnaviv/ci: fix yaml indentation
etnaviv/ci: expand first level of files in src/etnaviv/
etnaviv/ci: run only the relevant jobs when changing the ci expectations
broadcom/ci: avoid running the rpi4 jobs when changing the rpi3 expectations, and vice-versa
vk/update-aliases.py: drop dead --check-only
vk/update-aliases.py: allow specifying the files we want to update
vk/update-aliases.py: handle “no match” grep call
vk/update-aliases.py: sort files when informing the user of the matches
vk/update-aliases.py: simplify addition of other concatenated prefixes
vk/update-aliases.py: handle more concatenated prefixes
vk/update-aliases.py: enforce correct list order
vk/update-aliases.py: only apply renames for the vulkan api (not vulkansc)
v3dv/ci: only trigger on relevant changes
a630/ci: add another flake
freedreno/ci: move hang-y a630 jobs from pre-merge to nightly
spirv: add missing build dependency
ci/b2c: drop passthrough of unset CI_JOB_JWT
ci/b2c: stop ignoring errors in before_script
ci/b2c: fix indentation of comment and after_script: list
ci/b2c: drop unused B2C_EXTRA_VOLUME_ARGS
ci/b2c: tags are mandatory
ci/b2c: drop support for harbor.freedesktop.org
ci/b2c: drop unused --volume and --mount-volume
ci/b2c: always define job_volume_exclusions
ci/b2c: always define cmdline_extras
ci/b2c: use with:write instead of manually doing open;write;close
ci/b2c: export B2C_TEST_SCRIPT
ci/b2c: use envvars directly instead of converting them back and forth into cli args
ci/b2c: import all variables starting with `B2C_`
ci/b2c: rename B2C_TEST_SCRIPT to B2C_CONTAINER_CMD to match the automatic import
ci/b2c: identify dut by its id instead of its tags
docs: add release notes for 23.3.1
docs: add sha256sum for 23.3.1
docs: update calendar for 23.3.1
ci: deduplicate constructing the ARTIFACTS_BASE_URL
bin/gitlab_gql: fix --print-merged-yaml when --rev != HEAD
bin/gitlab_gql: print merged yaml as yaml instead of a python dict
v3d/ci: add flake
ci: fix indentation
ci: run every test when changing the build
docs: drop `:` in title
radv/ci: add flake
docs: document how to build the docs
vulkan/wsi: fix build when platform headers are installed in non-standard locations
ci/build: drop redundant meson/build.sh from jobs that already inherit from .meson-build
radv/ci: add flake on raven
ci: add nvk to the clang build
ci: disable collabora farm as it is currently offline
ci: fix farm restore pipelines
meson: always define {,DRAW_}LLVM_AVAILABLE one way or the other
docs: add release notes for 23.3.2
docs: add sha256sum for 23.3.2
docs: update calendar for 23.3.2
meson: update expat wrap
meson: update libarchive wrap
meson: update libxml2 wrap
meson: update zlib wrap
meson: use `allow_fallback` instead of manually listing the deps and what they provide
ci/containers: use build-libdrm.sh in debian/android
Revert “meson: add wrap for libdrm”
zink: update symbols that have become aliases for newer ones
zink/requirements: update feature and property names that have been promoted
docs/backport-mr: fix invalid nested formatting
docs: fix list whitespace
docs: mention that python package `packaging` is required on python 3.12+
lvp: update symbols that have become aliases for newer ones
egl: only accept APIs that are compiled in
ci: split & reuse debian version identifier
ci: convert several `find | xargs` to `find -exec`
ci/deqp: set default platform to `default` instead of glx, to also support wayland
docs: add release notes for 23.3.3
docs: add sha256sum for 23.3.3
docs: update calendar for 23.3.3
docs: close the 23.2 cycle
VERSION: bump for 24.0.0-rc1
.pick_status.json: Update to 4fe5f06d400a7310ffc280761c27b036aec86646
.pick_status.json: Mark 0557f0d59c5b22a8a934900ddc91f7a6057e146f as denominated
ci: make sure we evaluate the python-test rules first
.pick_status.json: Update to ff84aef116f9d0d13440fd13edf2ac0b69a8c132
.pick_status.json: Update to 10e2dbb63b9d1f8f35c4fc3f570cd19b3fc03b43
ci: fix job dependency error in MRs for bin/ci/* scripts
VERSION: bump for 24.0.0-rc2
ci/deqp: ensure that in `default` builds, wayland + x11 + xcb are all built
.pick_status.json: Update to d2b08f9437f692f6ff4be2512967973f18796cb2
.pick_status.json: Update to d0a3bac163ca803eda03feb3afea80e516568caf
.pick_status.json: Update to 90939e93f6657e1334a9c5edd05e80344b17ff66
.pick_status.json: Update to eca4f0f632b1e3e6e24bd12ee5f00522eb7d0fdb
VERSION: bump for 24.0.0-rc3
.pick_status.json: Update to b75ee1a0670a3207dfd99917e4f47d064a44197f
.pick_status.json: Update to 4cd5b2b5426e8d670fc3657eee040a79e3f9df1e
util: rename __check_suid() to __normal_user()
tree-wide: use __normal_user() everywhere instead of writing the check manually
util: simplify logic in __normal_user()
util: check for setgid() as well in __normal_user()
Eric R. Smith (1):
panfrost: fix panfrost drm-shim
Erico Nunes (6):
v3dv: Rework to remove drm authentication for wsi
lima/ci: update piglit ci expectations
Revert “ci: disable lima farm as it appears to be down”
panvk: Support modifiers for Wayland WSI
ci: lima farm is down
Revert “ci: lima farm is down”
Erik Faye-Lund (34):
docs: prepare for hawkmoth
docs: remove breathe/doxygen stuff
docs: improve readability of c-signatures
util: remove unused lut
panfrost: allow packing formats outside of pan_format.c
panfrost: bypass format-table for null-textures
panfrost: pass blendable formats to pan_pack_color
panfrost: store blendable_formats in panfrost_device
panfrost: look at correct blendable format version
panfrost: use perf_debug instead of open-coding
mesa/ffvs: use unreachable instead of assert
docs: apply permanent redirect
panfrost: do not open-code panfrost_has_fragment_job()
ci: opt-out panfrost from clang-format
panfrost: minify dimensions when converting modifiers
util/format: document NONE swizzle
lavapipe: do not use NONE-swizzle
panfrost: do not handle NONE-swizzle
d3d12: do not handle PIPE_SWIZZLE_NONE from sampler-view
zink: do not handle PIPE_SWIZZLE_NONE
meson: work around meson 0.62 issue
mesa/main: remove unused Log2 variants of width/height/depth
mesa/main: remove unused ClassID
mesa/main: use _mesa_is_zero_size_texture-helper
mesa/main: remove unused function
mesa/st: use _mesa_is_zero_size_texture-helper
zink: update profile schema
zink: use KHR version of maint5 features
panfrost: document ci failure
mesa/st: do not require render-target support for texture-only exts
mesa/st: do not check for emulated format
mesa: actually check for EXT_color_buffer_float support
mesa/main: require EXT_color_buffer_float for ES 3.2
mesa: check for float-format support
Etaash Mathamsetty (1):
driconf: add a workaround for Rainbow Six Siege
Faith Ekstrand (663):
nir: Add a lower_first_invocation_to_ballot option to lower_subgroups
nir: Add a lower_read_first_invocation option to lower_subgroups
nir/lower_bit_size: Fix subgroup lowering for floats
nir/lower_bit_size: Handle vote_feq/ieq separately
nir/lower_bit_size: Use u_intN_min/max()
nir: Split nir_lower_subgroup_options::lower_vote_eq into two bits
nir: Return b2b ops from nir_type_conversion_op()
nir/lower_bit_size: Use b2b for boolean subgroup ops
nir: add deref follower builder for casts.
nir: Handle wildcards with casts in copy_prop_vars
nir: Use nir_builder to insert movs
nir: Add asserts to nir_phi_builder_value_set_block_def
vc4: Stop assuming glsl_get_length() returns 0 for vectors
v3d: Stop assuming glsl_get_length() returns 0 for vectors
nir/lower_io_to_vector: Only call glsl_get_length() on arrays
nir/types: Support vectors in glsl_get_length()
nir: Handle array-deref-of-vec in vars_to_ssa
nir: Handle array-deref-of-vec in var split passes
nir/validate: Allow array derefs on vectors on function/shader_temp
nvk: Force all mappable BOs into GART pre-Maxwell
nvk: Fix nvk_heap_free() for contiguous heaps
nvk: Drop a bogus assert
nvk: Assert no storage images on Kepler
nir: Optimize boolean ieq/ine with an immediate
nouveau: Add initial headers and meson for the new compiler
nak: Copy the optimization loop from Intel
nak: Add a bunch of shader lowering code in NIR
nak: Add initial stubs for rust code
nvk: Run shaders through NAK
nak: Add the core IR
nak: Add Rust bindings for NIR
nak: Add initial translation from NIR
nak: Add a copy-prop pass
nak: Add a dead-code pass
nak: Add a util library
nak: Add a trivial register allocator
nak: Add a lowering pass for VEC and SPLIT instructions
nak: Add a lowering pass for ZERO sources and destinations
nak: Add bitset infrastructure
nak: Add encoding for a few instructions
nak: Encode program headers
nak: Header stuff
nak: Lower system values to a new load_sysval_nak intrinsic
nak: Implement load_sysval_nv as S2R
nak: Implement load_ubo
nak: Implement load/store_global
nak: Zero out the .w component of descriptors
nak: Add an instruction fuzzing tool
nak: Implement iadd and ishl
nak: Add a pass for computing instruction dependencies
nak: Implement 32-bit logic ops
nak: Add support for instruction predicates
nak: Implement integer comparisons
nak: Implement bcsel
nak: Rework ALU instruction encode
nak/meson: Use bindgen dependencies
nak: Add nak_compiler_create/destroy
nvk: Pass an actual nak_compiler to nak_compile_shader()
nak: Plumb the SM through to nak::Shader
nak: Encode load/store correctly on SM80
nak: Rework instruction encoding
nak: Implement boolean logic ops
nak: Lower 8 and 16-bit types
HACK: Support old meson
nak: Use Instr::num_srcs/dsts() less
nak: Get rid of meta instructions
meson: Pull in syn from crates.io
nak: Add SrcAsSlice and DstAsSlice traits
nak: Add a SrcModsAsSlice trait
nak: Use a different inner struct type for each opcode
nak: Use Src::Zero for load_const(0)
nak: Handle zeroes at emit time
nak: Implement i2f
nak: Implement fadd
nak: Rework integer compare ops
nak: Implement float comparisons
nak: Implement nir_op_b2f32
nak: Implement unary float and integer ops
nak: Allow iadd3 to take an immediate in srcs[2]
nak: Implement fsign
nak: Rework ALUSrc in emit code
nak: Rework source modifiers
nak: One of the predicates in IADD3 is a destination
nak: Implement Display for SSAValue
nak: Make Dst its own type
nak: Add modifier propagation
nak: Implement basic control-flow
nak: Move nak_compiler to nak_private.h
nak: Add a nir_shader_compiler_options to nak_compiler
nvk: Pull the NIR options from NAK
nak: Implement b2i32
nak: Implement iadd64
nak: Implement phis
nak: Add a union-find implementation
nak: Lower global access to scalars as needed
nak: Print names of missing instructions
nak: Implement unpack_64_2x32_split_*
WIP: nak: Rework the barrier assignment pass
nak: Add an SSAValueAllocator struct
nak: Pass an SSAValueAllocator through to map methods
nak: Handle fadd funnyness in the emit code
WIP: nak: Add a legalization pass
nak: Rename Imm to Imm32
nak: Add separate True and False source types
nak: Handle phis with non-SSA sources
nak: Support both destinations in PLOP3
nak: Drop the special cases for single-component vec/split
nak: Don’t emit MOVs for overlapping vec and split src/dst
HACK: nak: Lower iadd64 again
nak: Add a parallel copy instruction with lowering
nak: Use OpParCopy for OpVec and OpSplit lowering
nak: Get rid of the BitSet and BitSetMut traits
nak: Rename BitSetView to BitView
nak: Add a BitSet struct
nak: Add an SSAComp struct
nak: Rework dead-code
nak: Rework phis
nak: Add a space to the end of vec and split arg lists
nak: Add a liveness analysis pass
nak: Add a non-trivial register allocator
nak: Improve the dependency tracker
nak: Handle token re-use in dep tracking
nak: Implement nir_op_i(eq|ne) for booleans
nak: Fold [P]Lop3 sources
nak: Predicates default to true
nak: Implement nir_op_[iu](min|max)
nak: Implement nir_op_fmul
nak: Implement nir_op_(fmin|fmax)
nak: Implement nir_op_u2f
nak: Implement nir_op_vecN
nak: Implement MuFu and a bunch of float unops
nak: Move nak_sysval_attr_addr/sysval_idx higher in the file
nak: Implement input interpolation
nak: Handle multiple vector destinations in RA
nak: Use immediate offsets for load/store_global
nak: Implement OpFSOut with an OpParCopy
nak: Implement f2[iu]32
nak: Wire up ffma
nak: Add more legalization
nak: Implement right-shifts
nak: Implement nir_op_[iu]mul[_high]
nak: Enable nir_lower_idiv
nak: Add a NIR texture lowering pass
nak: Use more core NIR texture lowering
nak: Wire up texture ops
nak: Simplify the FromVariants proc macro
nak: Simplify the (Srcs|Dsts)AsSlice proc macro
HACK: spirv: Add a MESA_SPIRV_DUMP_PATH environment variable
nak: Add a NAK_DEBUG environment variable
nvk: Drop printing of NAK shaders
nvk: Pass NAK flags through to shader cache UUIDs
nak: Add a debug flag to assign worst-case instruction deps
nak: Rework vector handling
nak: Legalize vector sources
nak: Add a use tracker to RA
nak: Much more believable try_find_unused_reg_range()
nak: Implement nir_op[iu]mul_2x32_64
Revert “HACK: nak: Lower iadd64 again”
nak: Implement nir_op_ixor
nak: Implement undef instructions
nak: Implement image load/store
nak: Wire up OpLd and OpSt for local and shared
nak: Implement nir_intrinsic_load/store_scratch
nak: Add a smarter new_lop2 helper
nak: Improve RA failure messages
nak: Legalize OpShf
nak: Only put actually live SSA values in the ra.live_in sets
nak: Legalize more stuff
nak/nir: Lower image size and samples to txq
nak: Improve [FI]SETP encoding
nak: Legalize Op[FI]Setp
nak: Don’t allow r255 in texture or surface ops
nak: sin() and cos() require we divide by 2pi
nak: Add F2F and implement fquantize16
nak: Implement barriers
nvk: Plumb num_barriers through from NAK
nak: Implement load/store_shared
nak: Integers don’t have abs() source modifiers
nak: Add a mechanism for decorating sources with types
nak: Decorate sources with types
nak: Only divide FS inputs by .w for smooth interpolation
nak: Rework source modifiers a bit
nak: Add a Src::supports_src_type() helper
nak: Rework copy-prop to use source type decorations
nak: Implement nir_intrinsic_global_atomic_*
nak: Implement nir_intrinsic_shared_atomic_*
nak: Implement global/shared_atomic_comp_swap
nak: Implement image atomics
nak: Fix the 2nd predicate on LOP3
nak: Optimize OpLop3 and OpPLop3
nak: DCE things with constant false predicates
nak: Rework source modifiers instructions a bit
nak: Fold fsat into FAdd/FFma/FMul
nak: Delete unused imports and dead code
nak: Add accum predicates to Op[FI]Setp
nak: Add a Pred struct and move the enum to PredRef
nak: Fix multisampled texturing
nak: Legalize everything
nak: Rework cbufs a bit
nak: Implement indirect UBO loads
nak: Implement nir_op_b2b1 and nir_op_b2b32
nak: Follow memcpy semantics with OpParCopy
nak: Work in terms of bits for type sizes
nak: Add a builder
nak: Use the builder in some lowering passes
nak: Compute liveness in reverse block order
nak: Rework liveness to add next-use information
nak: Add a PerRegFile helper struct
nak: Record register pressure in liveness
nak: Initialize RA with only live registers
nak: Use num_regs instead of max_reg in RA
nak: Use pcopy.push() in RA
nak: Rework RA a bit
nak: Add some documentation for SSA values
nak: Print to stderr
nak/ra: Pass a PerRegFile num_regs into the allocator
nak: Allocate the minimum number of GPRs.
nak: Separate the CFG from liveness
nak: Break guts of liveness into traits
nak: Require Rust 1.70.0
nak: Handle dead destinations in RA
nak: Make calc_max_live a function of the Liveness trait
nak: Bring back bitset-based liveness
nak: Add num_gprs and tls_size to Shader
nak: Accurately set num_gprs
nak: Add a RegFileSet struct
nak: Add more SSA iterator options
nak: Add a new VecPair type
nak/nir: Add more helpers
nak: Emit if branches in the predecessor block
nak: Add a more awesome CFG data structure
nak: Store the blocks in the CFG
nak: Base liveness on CFG indices
nak: Add loop detection to the CFG
nak: Add a phi allocator
nak: Refactor nak_assign_regs a bit
nak: Use u32 for register indices
nak: Rework map_instrs()
nak: Add a new OpCopy instruction for parallel copy lowering
nak: Use the builder for the legalize pass
nak: Use OpCopy in legalize
nak: Use more OpCopy
nak: Add a Mem register file
nak: Handle RegFile::Mem in parallel copy lowering
nak: Allow DCE on functions
nak: Restructure liveness construction
nak: Add interference helpers
nak: Add a dominance check to CFG
nak: Add helpers to BasicBlock to get phis
nak: Add a to-CSSA pass
nak: Add an SSA repair pass
nak: Union find
nak/ra: Drop the pointless AssignRegs struct
nak/ra: Handle parallel copies as a special case
nak/ra: Don’t free killed for OpPhiSrcs
nak: Expose LiveSet for incremental liveness tracking
nak: Add a RegFileSet filter to NextUseLiveness::for_function()
nak: Add more NextUseLiveness helpers
nak: Add a spilling pass
nak: Use the correct number of GPRs on Turing+
nak: Spill registers before RA
nak: Add a debug flag to test spilling
nak: Implement shader clock
nak/ra: Improve coalescing
nak/spill: Tweak the construction of S sets
nak: Document spilling and RA
nak: Add an alloc_vec() to SSAValueAllocator
nak: Move all the IADD3 insanity to a new OpIAdd3X opcode
nak/legalize: Fix too many IADD3 source modifiers
nak: Disable lower_image_size_to_txs for NAK
nak: IMAD also has a destination predicate
nak: Remap GLSL_SAMPLER_DIM_SUBPASS and SUBPASS_MS to 2D and MS
nak: Fix instruction ordering in nak_ir.rs
nak: Rename OpBFind to OpFlo
nak: Implement Index[Mut] for RegTracker
nak: Use the right number of predicates in RegTracker
nak: Rework the barrier insert pass
nak: Rework calc_delay.rs
nak: Re-work Instr::get_latency()
nak: Emit FS_OUT before EXIT
nvk: Use sysvals for fragcoord etc. with NAK
nak: Handle flat FS inputs
nak: Add support for centroid and sample interp modes
nak: Use load_interpolated_input for frag_coord
nak: Properly handle OpFSOut in RA and liveness
nak: Handle empty OpFSOut
nak/nir: Several FS output fixes
nak: Implement load_sample_id and load_sample_mask_in
nak: Implement discard and demote
nak: Set TLS size properly in the shader header
nvk,nak: Plumb through the zs_self_dep key bit
nak: Use count_attribute_slots for FS input var sizes
nak: Pull sm, num_gprs, and tls_size into a ShaderInfo struct
nak: Stash a ShaderInfo in ShaderFromNir
nak: Rework FS outputs again
nak: Re-plumb compute shader info
nak: Plumb more FS info through to the C API
nvk/nak: Translate our new FS flags from NAK to nvk_shader
nak: Saturate depth writes
nak: Add support for gl_FrontFace
nak/nir: Fix helper invocations
nak/nir: Use nir_shader_intrinsics_pass for FS inputs
nak: Handle interpolate_at_offset
nak: Take components into account in load_*input
nak: Plumb uses_kill through from nak_from_nir
nak/nir: Plumb the FS key into lower_fs_input_intrin
nak/nir: Move frag_coord/sample_pos lowering to FS input lowering
nak/nir: Fix sample vs. pixel input interpolation
nak/nir: Add a load_frag_w helper
nak/nir: Interpolate gl_PointCoord
nak/nir: Return one sample for gl_SampleMaskIn[0] when sample shading
nak: Fold source modifiers in legalize
nak: Provide more detail when printing IR after passes
nak: Handle modifiers in dedup_srcs() in opt_lop()
nvk: Add a helper for lowering system values to root table loads
nvk: Lower more draw system values
nak: Take component into account in store_output
nak: Fix printing of OpASt
nak: Move NIR enum translation out of nak_sph.rs
nak: rustfmt fixes
nak: Simplify I/O gathering
nvk: Set clip/cull_enable for NAK shaders
nak: Run simple liveness data-flow bottom-up
nak/bitset: Add a helper for modifying in-place
nak: Don’t allocate bitsets in liveness data-flow
nak: Handle non-constant I/O offsets
nouveau/parser: Dump SET_STREAM_OUT_CONTROL_* properly
nak: Translate XFB info
nvk: Plumb through XFB info from NAK
nak: Add a Label struct for branch targets
nak: Add OpNop which can have a label
nak: Break indirect offset encoding into a helper
nak: Allow encoding Dst::None
nak: Add barrier instructions
nak/builder: Return the instruction from push_*()
nak: Implement NIR control barriers
nak: Implement From for SrcRef for more types
nak: Add enums for sysvals and attributes
nak: Plumb clip/cull enables through nak
nak/nir: Lower tessellation and geometry I/O
spirv: Fix locations for per-patch varyings
nak: NVIDIA calls them tessellation init shaders
nak: Rework OpALd and OpASt a bit
nak: Set per patch attribute count both places in the SPH
nak: Handle location_frac for FS outputs in nak_from_nir.rs
nak: Add lowering for per-vertex I/O
nak: Implement more attribute I/O
nak/nir: Lower load_primitive_id
nak,nvk: Plumb through tessellation info
nak: Implement load_tess_coord
nak: Fix lowering for patch_vertices_in
HACK: Only emit OpBar in compute shaders
nak/nir: Use count_vec4_slots instead of count_attribute_slots
nak: Add NIR lowering for attribute I/O
nak/nir: Lower system values before lowering I/O
nak: Use nak_nir_lower_vtg_io
nak: Fix a bunch of warnings
nak: Fix opt_out
nak/bitset: Improve set_words()
nak/bitset: Add an is_empty() helper
nak/bitset: Fix next_set()
nak/sph: Round tls_size up to a multiple of 16
nak: Fix repair_ssa() for back-edges
nak: Fix parallel copy handling in spilling
nak: Fix to_cssa()
nak/nir: Don’t lower 1-bit phis
nak: Support encoding -Zero
nak: Fix fneg to do fadd(-0, x)
nak: Rename lower_vec_split() to lower_ineg()
nak: Use Src::From<u32> and Src::From<bool>
nak: A quick rustfmt fix
nak: Upgrade to more modern meson
nak: Add some #[allow(dead_code)]
nak: Drop some unused helpers
nak: Get rid of dead code warnings in RegFileSet
nak: Get rid of warnings in nak_sph.rs
nak: Drop the final calc_max_live() after GPR spilling
nak: Don’t print a range for one register
nir: Add nvidia barrier intrinsics
nak/nir: Add a pass for adding convergence barriers
nak: Add OpBreak
nak: Handle control-flow barriers
nak: Use barriers for re-convergence
nak: Remove unnecessary control barriers
nak: Call nir_lower_subgroups()
nak: Use nir_shader_intrinsics_pass for system values
nak: Lower subgroup_id and num_subgroups
nak/nir: Allow boolean vote_ieq
nak/nir: Zero-pad subgroup masks
nak: Implement vote and ballot
nak: Fix the encoding of OpShfl
nak: Implement read_invocation and shuffle_*
nak: Allow 1-component image load/store
nak: Emit CCtl in barriers with acq/rel semantics
nak: Use strong ordering for Image load/store
nak: Use the simplified BAR.SYNC encoding
nak: Emit MemBar before Bar
nak: Insert an OpNop after OpBar
nak: Document a bit in encode_lds()
nvk: Enable subgroups features
nak: Rely on Rust 1.73 for next_multiple_of() and div_ceil()
nak: Require meson 1.3.0 and clean up a couple bits
meson: Set build.rust_std
ci: Bump container images for NAK dependencies
ci: Add syn to --force-fallback-for
ci: Update the python env for ci_run_n_monitor.py
nvk: Default to NAK on Turing+
nvk: Stop asserting 11-bit storage image handles
nvk: Free NAK shaders
nak: Fix copy-prop for OpPLop3 sources
nak: Drop OpAtomCas in favor of OpAtom with atom_op == CmpExch
nak: Make ALD/AST.PHYS a boolean
nak: Make encode_sm75 a method of Shader
nak: Plumb the nak_compiler through to lower_fs_input_intrin
nak: Rework FS input interpolation
nvk: Only advertise VK_KHR_shader_terminate_invocation if using NAK
nvk: Handle load_first_vertex in nvk_nir_lower_descriptors()
nak/nir: Lower indirect FS inputs
nvk: Only lower outputs to temporaries
nvk: Add a codegen helper for nir_shader_compiler_options
nvk: Move a bunch of codegen-specific lowering to helpers
nvk: Move the optimization loop to the nvk_codegen.c
nvk: Move the guts of nvk_compile_nir() to nvk_codegen.c
nvk: Move even more lowering into nvk_codegen.c
nvk: Use nak_fs_key instead of rolling our own
nak: Rename TLS to SLM
nak: Properly prefix nak_xfb_info
nak: Move clip, cull, and XFB into a nak_shader_info.vtg
nak: Add a writes_layer bit to nak_shader_info::vtg
nak: Handle the num_gpr offsetting inside nak
nvk: Use nak_shader_info natively
nak: Enable SM70 for Volta
nak: Stop passing undefs to ipa_nv
nak: Support dumping shader assembly as part of compile
nvk: Don’t set pipeline->base.type manually
nvk: Implement VK_KHR_pipeline_executable_properties
nvk: Drop nouveau_ws_bo_new_tiled()
nvk: Rework error handling in nouveau_ws_bo_new() and from_dma_buf()
nvk: Handle VMA allocation failure
nvk: Add a separate VMA heap for BDA capture/replay
nvk: Implement bufferDeviceAddressCaptureReplay
nvk: Advertise VK_KHR_synchronization2
nvk: Set the right API version in the ICD json files
nak: Add the predicate destination to OpShfl
nak: Add builder helpers for a few ops
nak: Use c == 0x0 for shuffle_up
nak: Lower scan/reduce in NIR
nak: Implement quad ops
nvk: Advertise the rest of the subgroup ops
nak: Rework reg and SSA value printing
nak: Make most Display stuff lower-case
nak: Rework opcode printing to use a new trait
nak: Implement DisplayOp on Op instead of Display
nak: Default InstrDeps::delay to 0
nak: Only write deps.delay when set
nak: Align instructions when printing
nak: Display memory access bits with the “.” prefix
nak: Make MemAddrType a part of MemSpace
nak: Display memory type at the end for load/store ops
nak: Rework printing of texture and image dims
nak: Two more print fixes
nak: gl_FragCoord and gl_PointCoord are screen-space interpolated
nvk/codegen: Fragment shader builtins are noperspective
nvk: Wire up MESA_VK_VERSION_OVERRIDE
nvk: Limit shader stages to supported stages
nak: Run rustfmt
nak: Only insert barriers around ifs if they actually re-converge
vulkan: Default override patch version to VK_HEADER_VERSION
nvk: Advertise Vulkan 1.1 on Turing+
nak: Drop the PrmtSelection stuff
nak: Add a builder helper for OpPrmt
nak: Rework OpPrmt a bit
nak: Implement nir_op_extract_*
nak: Fix int8/16 lowering
nak: Add base support for 8 and 16-bit types
nak: Implement more int/float conversions
nak: Implement integer conversions
nak: Handle non-DW-aligned UBO loads
nvk: Enable 8 and 16-bit integer types
nak: Implement scan/reduce on booleans
nak/nir: Handle CBuf alignment rules
nak: Revert “nak: Handle non-DW-aligned UBO loads”
nvk: Use the copy engine for CmdFillBuffer
nvk: Use the copy engine for NVK_DEBUG=zero_memory
nvk: Stop initializing the 2D engine
vulkan: Move vk_synchronization2 to vk_synchronization
vulkan: Add some auto-generated synchronization helpers
vulkan: Add helpers for pipeline stage flags
vulkan: Add helpers for access flags
nvk: Move Begin/EndTransformFeedback to nvk_cmd_draw.c
nvk: Rework transform feedback stalling
nvk: Implement vkCmdPipelineBarrier2 for real
nvk: Drop unnecessary per-draw/dispatch cache maintenance
nvk: Drop MME_DMA_SYSMEMBAR before indirect draw/dispatch
nak: Drop a bunch of SET_REFERENCE from the pre-Turing paths
nvk: Advertise VK_EXT_subgroup_size_control
nil: Add support for filling out linear texture headers
nouveau: Rename nvidia-headers to headers
nouveau: Move headers/classes to headers/nvidia/classes
nak: Run rustfmt again
nak: Fix integer roll-over when we have a u64vec4
nak: Set .64/.32 on CSSR as needed
nak/nir: Don’t use nir_lower_bit_size on 64-bit values
nak: Implement 64-bit ineg
nak: Natively implement 64-bit shifts
nak: Lower isign in NIR
nak: Rework printing of comparisons
nak: Implement 64-bit comparisons
nak: Don’t ask NIR to lower [iu]mul64_2x32
nak: Use the right source types for I2F, F2I, and F2F
nak: Fix encoding of 64-bit F2I, I2F, and F2F
nak: Implement b2i64
nak/nir: Don’t lower 64-bit conversions
nvk: Advertise shaderInt64
nvk: Advertise VK_EXT_shader_subgroup_ballot/vote
nak/nir: Handle non-32-bit data in lower_scan_reduce
nvk: Advertise KHR_shader_subgroup_extended_types
nvk: Advertise VK_KHR_shader_atomic_int64
nak/nir: Trim image load/stores based on format
nak: Lower 64-bit image load/store
nak: Handle 64-bit image atomics
nil: Add R64_SINT and R64_UINT formats
nvk: Don’t disable non-texturable formats
nvk: Implement VK_EXT_shader_image_atomic_int64
nak: Simplify Src::is_predicate()
nak: Replace OpBMov with OpBClear
nak: Fix scheduling for control barriers
nak: Add a barrier register file
nak: Add back OpBMov with better semantics
nak: Add support for spilling barriers
nak: Take num_barriers from RA
nak: Make barriers SSA-friendly
nak: Force RA to allocate bar_in/out to the same register
nak: Add a barrier propagation pass
dxil: Use mesa_prim consistently
glsl: Properly remap GL_* to MESA_PRIM
intel/vec4: Use MESA_PRIM_* instead of GL_*
nir: Return a mesa_prim from gs_in_prim_for_topology
compiler: Fix a comment
radeonsi: Drop an unnecessary cast
nvk: Advertise VK_EXT_scalar_block_layout
nak: Advertise subgroupBroadcastDynamicId
nak: Add a B32 source type
nak: Rework the OpIAdd3/OpIAdd3X split
nak/legalize: Handle the src0/1 source mod condition for OpIAdd3X
nak: Legalize immediates with source modifiers
nak: Implement uadd_sat
nak: Implement usub_sat
nvk: Implement VK_EXT_texel_buffer_alignment
spirv: Plumb variable alignments through to NIR
nir: Respect variable alignments in lower_vars_to_explicit_types
nak: rustfmt
nak: Restructure for better module separation
ci: Also rustfmt binaries
nir: Split has_[su]dot_4x8 bits into regular and _sat versions
nir: Lower [su]dot_4x8_[ui]add_sat to [su]dot_4x8_[ui]add
microsoft: Stop claiming dot_4x8_sat support
nak: Rework printing of int/float types and rounding modes
nak: Wire up DP4
nvk: Advertise KHR_shader_integer_dot_product
nak: Split legalize into per-SM functions
nak: Initial WIP SM50 backend
nak: Rework set_src_imm20 in nak_encode_sm50
nak: Rewrite SM50 encode_fadd to not use encode_alu
nak: Rename LogicOp to LogicOp3
nak: Use OpLop2 and OpPSetP pre-SM70
nak: Rework the SM50 encoding of isetp
nak: Add SM50 encodings for ALD and AST
nak: Only split texture destinations on Volta+
nak: Rework nvfuzz for SM50
nak/nv50: Rewrite the encoding of OpShf
nak/sm50: Wire up tex ops
nak: Rewrite the SM50 encoding of OpF2I
nak/sm50: Rewrite the encoding for OpIMnMx
nak: Implement FS input interpolation on SM50
nak/sm50: Rewrite the encoding for OpMov
nak: Drop the SM50 encoding of BREV
nak/sm50: Add better helpers for encoding sources with modifiers
nak/sm50: Stop using ALUSrc for IADD2
nak/sm50: Drop src_mod_has* in favor of core helpers
nak: Clean up compiler warnings
nak: Add barriers on Volta
nak/nvfuzz: Add an SM parameter
nak: Drop the fmnmx from Builder
nak: Add an ftz bit to a bunch of float ops
nak: Plumb through float controls
nvk: Advertise VK_KHR_shader_float_controls
nak: Plumb through float controls for fset[p]
nak: Plumb through float controls for frnd[p]
nak: Add dnz bits to OpFMul and OpFFma
nak: Audit remaining FTZ/DNZ bits on sm70+
nak: Audit sm50 for FTZ/DNZ bits
nak: Clean up instruction printing a bit
nak: Rework barrier handling a bit
nvk: Make NVK_DEBUG=push an alias for push_dump
nvk: s/device/dev in nvk_descriptor_set_layout.c
nvk: Plumb a physical device into descriptor_stride_align_for_type
nvk: Add a nvk_min_cbuf_alignment() helper and use it
nvk: Add an NVK_MIN_TEXEL_BUFFER_ALIGNMENT #define
nak: Reduce minStorageBufferAlignment
nvk: Simplify alignment limit plumbing
nvk: CBuf alignment reduces to 64B on Turing
nvk: Throw Tegra behind NVK_I_WANT_A_BROKEN_VULKAN_DRIVER
nvk: Rework the way we set up memory heaps/types
nir: Add a new has_fmulz_no_denorms flag
nak: Set .ftz on f32 ops by default
nak: Implement fmulz and ffmaz
nvk: Enable NAK by default for Volta
nak: Don’t set both FTZ and DNZ at the same time
nvk: Implement VK_EXT_multi_draw
nak: Add a delay of 2 cycles for barriers
nak: Rework the dependency pass
nak: Handle negative cbuf offset immediates
nak/sm50: Fix immediate encodings
nak/sm50: Fix legalization of OpIAdd
nak/sm50: Add legalization and encoding for OpLdc
nvk/nir: Add cbuf analysis to nvk_nir_lower_descriptors()
nvk/nir: Lower UBO loads to load_ubo when we have a cbuf
nvk: Add a cbuf_bind_map to nvk_shader
nvk: Stash descriptor set sizes
nvk: Rework push_indirect to take an address
nvk: Set MME_DATA_FIFO_CONFIG on device init
nvk: Don’t flush descriptors in BeginConditionalRendering
nvk: Upload cbufs based on the cbuf_map
nvk: Add debug flags to the physical device
nvk: Enable cbufs
nvk: Use ENUM_PACKED for enums instead of PACKED
nir: Scalarize bounds checked loads and stores
nak: Switch to //-style comments
nak: Plumb shader model into instruction latency queries
nak: Handle minimum execution latencies in the dep tracker
nvk: Advertise VK_KHR_vulkan_memory_model
nvk: Use render->color_att_count for color write enables
nvk: Support extendedDynamicState3ColorWriteMask
nak: Move the copy detection part of opt_copy_prop to a helper
nak: Fix copy-prop for fp64
nak: Copy propagate and constant fold OpPrmt
nak: Make OpAtom::cmpr a GPR source
nak: Pass SrcTypes around instead of RegFile in legalize
nak/sm70: Allow src2 of 3src ops to be an immediate
nak: OpDAdd doesn’t have saturate
nak: Rework encoding of ALU instructions on SM70+
nak: Add the rest of the double-precision ops
nak: Split fmul/ffma handling from fmulz/ffmaz
nak: Wire up 64-bit nir_op_fadd/ffma/fmul and comparisons
nak: Fix nir_op_f2f64
nak: Implement b2f64
nak/nir: Set nir_lower_io_lower_64bit_to_32 for varyings
meson: Update our rust dependencies
nak: Fix encoding of dsetp with RZ on SM70+
nak: Implement 64-bit nir_op_fsign
nak/sm50: Add encoding and legalization for dadd/dfma/dmul/dsetp
nak/sm50: Fix encoding of f20 immediates
nak/sm50: Fix encoding of iadd with imm32
nak/sm50: Properly legalize OpSel and drop an assert
nak/sm50: Add DMnMx and use it for fp64 fmin/fmax
nir/lower_doubles: Add lowering for fmin/fmax/fsat
nak/nir: Lower a bunch of fp64
nvk: Advertise shaderFloat64
nvk: Free shaders created by codegen
nvk: Unref shaders on pipeline free
nvk: Don’t ignore ExternalImageFormatInfo
nak: Fix TCS output reads
Felix DeGrood (3):
anv: remove CS_FLUSH from query regression
driconf: add Dying Light 2 to Intel XeSS workaround
driconf: add Witcher3 to Intel XeSS workaround
Felix bridault (1):
radv: use 32bit va range for sparse descriptor buffers
Florian Weimer (1):
meson: C type error in strtod_l/strtof_l probe
Francisco Jerez (70):
intel/l3/gfx11+: Add tile cache partition to intel_l3_config struct.
intel/l3: Define helper for obtaining the size of an L3 partition in KB.
intel/l3: Set up L3FullWayAllocationEnable config if ALL partition has over 126 ways.
intel/dg2: Import L3 cache configurations.
intel/mtl: Import L3 cache configurations.
intel/xehp+: Add TBIMR-related genxml definitions.
intel/xehp+: Import algorithm for TBIMR tiling parameter calculation.
intel/xehp+: Add dynamic state flags controlling whether TBIMR is enabled during 3D primitives.
intel/xehp+: Define driconf option for selectively disabling TBIMR.
iris/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation.
anv/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation.
anv/xehp+: Enable TBIMR in generated draw calls.
intel/xehp: Adjust TBIMR performance chicken bits.
intel/xehp+: Adjust TBIMR batch size based on slice count.
intel/xehp+: Use TBIMR tile box check in order to avoid performance regressions.
intel/xehp: Enable TBIMR by default.
intel/eu/xe2+: Add support for 10-bit SWSB representation on Xe2+ platforms.
intel/fs/xe2+: Add comment reminding us to take advantage of the 32 SBID tokens.
intel/fs/xe2+: Teach SWSB pass about the behavior of double precision instructions.
intel/fs/xe2+: Handle extended math instructions as in-order in SWSB pass.
intel/eu/xe2+: Add definition for size of GRF space on Xe2.
intel/fs/xe2+: Don’t special case SEL_EXEC in inferred_exec_pipe().
intel: Improve N-way pixel hashing computation to handle pixel pipes with asymmetric processing power.
intel/compiler: Add max_polygons FS compilation parameter.
intel/compiler: Add multipolygon dispatch fields to brw_wm_prog_data.
intel/compiler: Add polygon count statistic to brw_compile_stats.
intel/fs: Add separate constructor of fs_visitor for fragment shaders.
intel/fs: Map all GS input attributes to ATTR register number 0.
intel/fs: Map all VS input attributes to ATTR register number 0.
intel/fs: Map all TES input attributes to ATTR register number 0.
intel/fs: Assert fs_reg::nr is always zero for ATTR registers in geometry stages.
intel/fs: Consider ATTR registers with different fs_reg::nr as belonging to disjoint register spaces.
intel/fs: Provide component index explicitly to interp_reg().
intel/fs: Pass builder to per_primitive_reg().
intel/fs: Fix fs_reg::component_size() to handle two-dimensional register regions.
intel/fs: Rework layout of FS vertex setup data in ATTR file to support multi-polygon dispatch.
intel/fs: Don’t copy-propagate ATTR registers in multi-polygon FS shaders when invalid.
intel/compiler: Don’t change types for copies from ATTR file.
intel/fs/gfx12+: Don’t set nir_divergence_single_prim_per_subgroup option for fragment shaders.
intel/fs/gfx12: Don’t consider multipolygon PS to have packed dispatch.
intel/fs: No need to copy null destinations in lower_simd_width.
intel/fs: Fix PS thread payload setup for depth_w_coef_reg.
intel/fs/gfx12: Implement multi-polygon format of back/front-facing flag in PS payload.
intel/fs/gfx12: Implement multi-polygon format of render target array index in PS payload.
intel: Add debug flag for enabling dual-SIMD8 fragment shader dispatch.
intel/compiler: Attempt to build dual-SIMD8 variant of fragment shaders on gfx12+ platforms.
intel/genxml: Add 3DSTATE_PS definitions needed for dual-SIMD8 dispatch on Gfx12+.
intel/gfx12: Enable SIMD8 dispatch in 3DSTATE_PS for FS multipolygon dispatch.
iris/gfx12: Hook up dual-SIMD8 fragment shader dispatch.
anv/gfx12: Hook up dual-SIMD8 fragment shader dispatch.
intel/fs/xe2+: Stop building SIMD8 compute-like shaders (CS/BS/TS/MS).
intel/fs/xe2+: Stop building SIMD8 fragment shaders.
intel/fs/xe2+: Stop building SIMD8 shaders for geometry stages (VS/TCS/TES/GS).
intel/eu/xe2+: Add helpers for constructing registers in 512b units.
intel/fs/xe2+: Implement PS thread payload register offset setup.
intel/fs/xe2+: Fix for new layout of X/Y pixel coordinates in PS payload.
intel/fs/xe2+: Update uses of pixel/sample mask from PS thread payload.
intel/fs/xe2+: Update location of sample ID fields in PS payload.
intel/fs/xe2+: Update poly info PS payload for new multi-polygon dispatch format.
intel/fs: Add support for vector payload values to fetch_payload_reg().
intel/fs/xe2+: Enable new format of barycentrics in PS payload.
intel/fs/xe2+: Update for new layout of vertex setup data in PS payload.
intel/fs/xe2+: Implement support for multi-polygon vertex setup data in PS payload.
intel/fs/xe2+: Implement layout of mesh shading per-primitive inputs in PS thread payloads.
intel/fs: Plumb shader instead of compiler to get_lowered_simd_width() and friends.
intel/fs/xe2+: Lower SIMD width of instructions that access ATTR file from SIMD2x8/4x8 FS.
intel: Add debug flags for enabling Xe2+ multipolygon fragment shader dispatch modes.
intel/fs/xe2+: Attempt to build quad-SIMD8 and dual-SIMD16 FS variants on Xe2+ platforms.
intel/xe2+: Implement fragment shader dispatch state setup.
intel/compiler/xe2: Don’t disassemble non-existent fields.
Frank Binns (4):
pvr: rename some more instances of ‘reserved’ to ‘carveout’ for consistency
include/drm-uapi: add pvr_drm.h
pvr: Add powervr winsys implementation
pvr: alloc WSI memory via GPU when there isn’t a valid display FD
Friedrich Vock (24):
aco: Update printed block kinds
vulkan: Don’t use set_foreach_remove when destroying pipeline caches
radv/ci: Update skips comments
ac/gpu_info: Manually compute L3 size for Navi33
radv: Enable compute dispatch tunneling
radv,vtn,driconf: Add and use radv_rt_ssbo_non_uniform workaround for Crysis 2/3 Remastered
radv/rt: Initialize unused children in PLOC early-exit
radv/rt: bsearch inlined shaders
radv/rt: Free traversal NIR after compilation
radv,aco: Convert 1D ray launches to 2D
radv/rt: Move per-geometry build info into a geometry_data struct
radv/rt: Acceleration structure updates
radv/rt: Add workaround to make leaves always active
radv: Fix shader replay allocation condition
nir: Make is_trivial_deref_cast public
nir: Handle casts in nir_opt_copy_prop_vars
util: Provide a secure_getenv fallback for platforms without it
vulkan: Use secure_getenv for trigger files
aux/trace: Guard triggers behind __normal_user
vtn: Use secure_getenv for shader dumping
mesa/main: Use secure_getenv for shader dumping
radv: Use secure_getenv in radv_builtin_cache_path
radv: Use secure_getenv for RADV_THREAD_TRACE_TRIGGER
util/disk_cache: Use secure_getenv to determine cache directories
GKraats (1):
i915G: show correct number of needed ALU instructions at errmess
Ganesh Belgur Ramachandra (9):
radeonsi: Fix clear-render-target shader for 1darrays in NIR
radeonsi: “create_dma_compute” shader in nir
radeonsi: “create_fmask_expand_cs” shader in nir
radeonsi: “get_blitter_vs” shader in nir
asahi: fixes prevailing ‘-Werror=maybe-uninitialized’ issue
radeonsi: enable nir pass for 64 bit operations
radeonsi: add comments for unpack_2x16* utility functions
radeonsi: convert “create_query_result_cs” shader to nir
radeonsi: convert “gfx11_create_sh_query_result_cs” shader to nir
Georg Lehmann (28):
aco, radv: vectorize f2f16 if rounding mode is rtz
aco: force uniform result for LDS load with uniform address if it can be non uniform
aco: stop using cstdint
aco: namespace aco_opcode
aco: deduplicate instr_class definition
aco: deduplicate Format definition
aco: don’t CSE v_permlane across exec
aco: use null operand for SOPK s_waitcnt
aco: fix detecting sgprs read by SMEM hazard
aco/tests: add some missing scc defs
aco/tests: use correct operand size for some 64bit ops
aco: use lm for carry out in vsub32
aco: add missing scc def for SALU quad broadcast
aco/gfx10+: don’t use v_cmpx with VCC def
aco: use correct operand size for int tg4 wa
aco: add src/def count and size for all ALU opcodes
aco: validate ALU operands and defs
aco/sched: treat p_dual_src_export_gfx11 like export
aco: don’t optimize DPP across more than one block
aco: add test for post-ra DPP clobbered in linear cfg
aco: optimize 32bit fsign by using fmulz with Inf
aco: shrink buffer stores with undef/zero components
aco/gfx12: implement broadcast dmask shrink behavior
aco: apply packed fneg commutatively
aco: fix applying input modifiers to DPP8
aco: clean up fneg/fabs combining
aco: apply fneg/fabs to VOP3P
aco: stop scheduling at p_logical_end
George Ouzounoudis (9):
nvk: Move SET_BLEND_STATE_PER_TARGET to graphics state initialization
nvk: Support extendedDynamicState3ColorBlendEnable
nvk: Support extendedDynamicState3ColorBlendEquation
nvk: Support extendedDynamicState3SampleMask
nvk: Support extended dynamic state for alpha to coverage/one
vulkan: Fix dynamic graphics state enum usage
nvk: Support extended dynamic state for rasterization stream
nvk: Remove pipeline state setting functions
nvk: Support extended dynamic state for tessellation domain origin
Gert Wollny (15):
virgl: Use host reported limits for max outputs
r600: Add callbacks for get_driver_uuid and get_device_uuid
r600: Add experimental get_compute_state_info
r600: Link with libgalliumvl, when enabling rusticl this is needed
r600/sfn: Fixup component count only if intrinsic has it
r600/sfn: Allow skipping backend shader optimization for a subset of shaders
r600/sfn: keep workgroup and invocation ID registers for whole shader
r600/sfn: Fix usage of std::string constructor
r600/sfn: Don’t try to re-use iterators when the set is made empty
zink: Don’t pass a blend state when we have full ds3 support
r600: lower dround_even also on hardware that supports fp64
virgl: Use better reporting for mirror_clamp features
radv: Fix compilation with gcc-13 and tsan enabled
nir/lower_int64: Fix compilation with gcc-13 and tsan enabled
nir/builder: Fix compilation with gcc-13 when tsan is enabled
Giancarlo Devich (1):
nir: Workaround MSVC internal compiler error in ARM64 build
Guilherme Gallo (19):
ci/bin: Use iid instead of SHA in gitlab_gql
ci/bin: Do not forget to add early-stage dependencies
ci/bin: Refactor create_job_needs_dag
ci/lava: Use project_name instead of hardcoded `mesa`
ci/lava: Fix imports formatting
ci/lava: Refactor UART definition building blocks
ci/lava: Create LAVAJobDefinition
ci/lava: Make SSH definition wrap the UART one
ci/lava: Enable SSH by default in fastboot devices
ci/lava: Add unit tests covering job definition
ci/bin: Fix find_dependency function calls
ci/bin: Replace AIOHTTPTransport with RequestsHTTPTransport
ci/bin: gql: make the query cache optional
ci/bin: gql: Log the caching errors
ci/bin: gql: Implement pagination
ci/bin: gql: Improve queries for jobs/stages retrieval
ci/bin: Fix gitlab_gql methods that use needs DAG
ci/bin: Fix mypy errors in gitlab_gql.py
ci/bin: Print a summary list of dependency and target jobs
Haihao Xiang (1):
anv: Fix typo in transition_color_buffer
Hans-Kristian Arntzen (2):
radv/radeonsi: Forward correct GPU instance to umr.
wsi/x11: Add workaround for Detroit Become Human.
Helen Koike (3):
ci/zink: add spec@ext_timer_query@time-elapsed to flakes
ci/ci_run_n_monitor: abort when target gets skipped
ci: fix python-test dependency error on merge requests
Hyunjun Ko (2):
vulkan/video: fix a typo
anv/video: fix out-of-bounds read
Iago Toral Quiroga (13):
v3d,v3dv: fix MMU error from hardware prefetch after ldunifa
v3d: implement support for PIPE_CAP_NATIVE_FENCE_FD
broadcom: fix scheduling dependencies for SETMSF instruction
v3dv: disallow image stores on VK_KHR_DISPLAY surfaces
v3dv: switch timestamp queries to using BO memory
broadcom: disable perquad tmu loads after discards
broadcom: lower null pointers
v3dv: implement VK_KHR_shader_terminate_invocation
v3dv: implement VK_EXT_shader_demote_to_helper_invocation
v3dv: expose VK_EXT_subgroup_size_control
broadcom/compiler: fix incorrect flags setup in non-uniform if path
broadcom/compiler: fix incorrect flags update for subgroup elect
broadcom/compiler: be more careful with unifa in non-uniform control flow
Ian Romanick (39):
nir/split_vars: Don’t split arrays of cooperative matrix types
nir/lower_packing: Don’t generate nir_pack_32_4x8_split on drivers that can’t handle it
nir/lower_packing: Add lowering for nir_op_unpack_32_4x8
nir/builder: Teach nir_pack_bits and nir_unpack_bits about 32_4x8
intel/vec4: Don’t emit an empty ELSE
intel/compiler: Add basic CFG validation
intel/compiler: Limit scope of cur_endif variable
intel/compiler: Delete bidirectional block links in opt_predicated_break
intel/compiler: Don’t create extra CFG links in opt_predicated_break
intel/compiler: Don’t create extra CFG links when deleting a block
intel/compiler: Don’t promote CFG link types when removing a block
intel/fs: Don’t add MOV instructions to DO blocks in combine constants
intel/compiler: Verify that DO is alone in the block
nir: Handle divergence for decl_reg
intel/fs/xe2+: Pass correct dispatch_width to fs_generator for geometry-processing stages.
intel/cmat: Update get_slice_type for packed slices
intel/cmat: Add lowering for cmat_insert and cmat_extract
intel/cmat: Enable packed formats for unary, length, and construct
intel/cmat: Enable packed formats for binary ops
intel/cmat: Enable packed formats for scalar ops
intel/cmat: Add lowering for cmat_bitcast
intel/cmat: Lower cmat_load and cmat_store
intel/compiler: Initial bits for DPAS instruction
intel/disasm: Disassembly support for DPAS
intel/compiler: Validation for DPAS instructions
intel/fs: Fix scoreboarding for DPAS
intel/fs: DPAS lowering
intel/fs: nir: Add nir_intrinsic_dpas_intel
anv: Add anv_physical_device::has_cooperative_matrix
anv: Set COMPUTE_WALKER systolic mode enable flag
anv: Set PIPELINE_SELECT systolic mode enable flag
anv: Lower indirect derefs again after lowering cooperative matrices
anv: Select the SIMD mode very early when cooperative matrices are used
intel/dev: Advertise integer configs with saturatingAccumulation too
intel/dev: Enable VK_KHR_cooperative_matrix on all Gfx9+ GPUs
intel/cmat: Generate better code for nir_intrinsic_cmat_insert
intel/compiler: Disable DPAS instructions on MTL
intel/compiler: Track lower_dpas flag in brw_get_compiler_config_value
intel/compiler: Track mue_compaction and mue_header_packing flags in brw_get_compiler_config_value
Italo Nicola (4):
panfrost: fix untracked dependency when converting resource modifier
gallium: stop calling resource_copy_region for multisampled copy_image
panfrost: legalize afbc before blitting
panfrost: expose support for EXT_copy_image
Iván Briano (8):
anv: use the right vertexOffset on CmdDrawMultiIndexed
hasvk: ensure we reapply always pipeline dynamic state in runtime state
anv: allow NULL index buffers
anv: remove no longer valid assert
anv: handle VkBindMemoryStatusKHR on buffer/image memory bind
anv: add support for Cmd*DescriptorSet*2KHR
anv: move astc_emu to use descriptors2 calls
anv: enable VK_KHR_maintenance6
Jan Beich (2):
intel: make CLOCK_TAI optional for non-Linux
intel: make CLOCK_BOOTTIME optional for non-Linux
Jani Nikula (7):
nir: add names to some typedef’d structs/enums
nir: drop **< style documentation comments
isl: drop **< style documentation comments
docs: Add docs/header-stubs/README.rst
docs/vulkan: use hawkmoth instead of doxygen
docs/nir: use hawkmoth instead of doxygen
docs/isl: use hawkmoth instead of doxygen
Janne Grunau (4):
gallium: Avoid empty version scripts in pipe-loader
gallium: Fix i915 pipe-loader build
gallium: Do not create pipe-loader version scripts for disabled drivers
asahi: Fix typo in arch check in agx_get_gpu_timestamp
Jesse Natalie (64):
microsoft: Disable post-merge CI for Windows
d3d12: Only set draw params root parameter index for actual draw params
dzn: Implement VK_MSFT_layered_driver
wgl: Take pixelformat color channels into account for choosing a PFD
winsys/gdi: Handle 4444 and 1010102 texture formats
winsys/gdi: Update is_displaytarget_format_supported to reflect reality
d3d12: Don’t support displaytargets that can’t be supported by GDI/DXGI
dzn: Use vk_properties helper
vulkan: Remove no-longer-needed prototypes for ICD entrypoints
vulkan: Consolidate common ICD methods
vulkan: Support loader interface v7
dzn: Fix memory type sorting
microsoft/compiler: Set src/dest nir types on image intrinsics when deducing format
d3d12: Disable common state promotion for non-simultaneous-access textures
d3d12: Initialize shader key swizzle for non-int textures
d3d12: Add a fallback for int clears where value can’t be cast to float
d3d12: Binding buffers as SSBO/storage image needs to add buffer ranges
d3d12: Change memory barrier implementation
d3d12: Support ARB_texture_view
d3d12: Use format casting for shader images
d3d12: GL4.3
microsoft/compiler: Bump signature limits for 32 rows of 4 components
microsoft/compiler: Don’t declare PS output registers split across variables
microsoft/compiler: Don’t use 64-bit types for signature entries
microsoft/compiler: When packing fractional inputs, find a row with space for it
microsoft/compiler: Stop lowering all I/O to temps
d3d12: Fix location_frac_mask bitfield size
d3d12: Split dvec3 interpolants into dvec2 and double
d3d12: Support enhanced layouts for VS inputs
d3d12: Fix GS variant I/O slot counts
d3d12: Enable ARB_enhanced_layouts and ARB_texture_mirror_clamp_to_edge
d3d12: Reference count queries in a batch
d3d12: ARB_query_buffer_object and GL4.4
d3d12: PRIMITIVES_GENERATED for stream > 0 should only be an SO query
d3d12: Handle cull distance as an XFB target
d3d12: Fix MSAA-disabling pass; sample mask should be 0 for helper lanes
d3d12: GL4.5
nir_lower_mem_access_bit_sizes: Fix write-mask-constrained 3-byte stores as atomics
nir: Add a flag to opt_if to prevent fighting with splitting 64bit phis
d3d12: Fixes for QBO shaders
d3d12: Enable some 4.6 extensions that were already implemented
d3d12: GL4.6
nir_lower_mem_access_bit_sizes: Fix assert (bit -> byte size)
microsoft/compiler: Fix lower_mem_access_bit_size callback result
d3d12/driconf: Force on ARB_texture_view for Blender
d3d12: Fix multidimensional array ordering
d3d12: Fix h264 encoder 32-bit build (uint64_t -> size_t)
d3d12: Fix hevc encoder 32-bit build (uint64_t -> size_t)
microsoft/clc: Fix image lowering pass to only erase variables at the end
microsoft/clc: Fix images with multiple derefs for real
microsoft/clc: Add a test which sinks image derefs
microsoft/clc: One more image lowering fix
compiler/clc: Don’t fail to parse SPIR-V if there’s no kernels
microsoft/clc: Flip on capabilities to prevent warning spew
microsoft: Whitespace change to trigger CI
vulkan/wsi: Convert bit tests to bool with != 0
util: Re-implement getenv for Windows
d3d12: Add a debug flag to opt out of singleton behavior
d3d12: Only destroy the winsys during screen destruction, not reset
libgl-gdi: Update wgl test to use a 32bit framebuffer
libgl-gdi: Update wgl test to set debug flags needed for tests
dzn: Fix 3D to 2D image copies
zink: Add ASSERTED to vars that are only used for asserts
mesa: Consider mesa format in addition to internal format for mip/cube completeness
Jianxun Zhang (12):
intel/isl: Add a debug option to override modifier list
intel: Move mod_plane_is_clear_color() into isl
intel/vulkan: Report clear color in subresource layout
intel/vulkan: Allow modifiers supporting fast clear
intel/vulkan: Specify offset when creating aux state tracker
intel/vulkan: Import aux state tracking buffer
intel/vulkan: Remove private binding on fast clear region
intel/vulkan: Use the last 2 dwords of clear color struct
intel/vulkan: Correct a comment about an offset in fast clear
intel/vulkan: Update comment of a workaround of modifiers
intel/vulkan: Add COMPRESSED_CLEAR state in layout translation
intel/isl: Add Gfx 12.x RC_CCS_CC into modifier scores
Job Noorman (5):
ir3: correctly set bit size for 64b constant @load_ubo
nir: add _safe variants of nir_foreach_reg_load/store
ir3: lower 64b registers
nir: add helper to create cursor after all @decl_regs
ir3: lower 64b registers before creating preamble
Jonathan Gray (2):
intel/common: add directory prefix to intel_gem.h include
zink: put sysmacros.h include under #ifdef MAJOR_IN_SYSMACROS
Jordan Justen (25):
intel/l3: Use devinfo->urb.size when cfg urb-size is 0.
anv: Add more space for init_render_queue_state() batch (MTL regression)
intel/dev/wa: Raise error if mesa_defs.json contains unknown platforms
intel/dev: Rename mtl-m to mtl-u
intel/dev: Rename mtl-p to mtl-h
intel/compiler: Define XE2 compiler enum
intel/genxml: Update COMPUTE_WALKER for xe2
iris: Set COMPUTE_WALKER Message SIMD field
anv: Set COMPUTE_WALKER Message SIMD field
intel/genxml: Update INTERFACE_DESCRIPTOR_DATA for xe2
anv, iris: Update INTERFACE_DESCRIPTOR_DATA programming for xe2
iris: xe2 doesn’t have INTERFACE_DESCRIPTOR_DATA::BarrierEnable
intel/genxml: Update 3DSTATE_TE for xe2
isl: Add mocs for xe2
intel/genxml: Add UNIFIED_COMPRESSION_FORMAT enum for xe2
anv, blorp, iris: Update 3DSTATE_PS programming for xe2
anv, blorp, iris, intel/genxml: Update 3DSTATE_VS for xe2
anv, blorp, iris, intel/genxml: Update 3DSTATE_PS_EXTRA for xe2
intel/batch_decoder: Update 3DSTATE_PS decoding for xe2
anv, iris, intel/genxml: Update 3DSTATE_GS for xe2
anv, iris, intel/genxml: Update 3DSTATE_HS for xe2
intel/compiler: Pass max_polygons to copy-prop from fs_visitor.
intel/xe2+: Implement brw_wm_state_simd_width_for_ksp() on Xe2+.
intel/genxml/gfx125: Move L1_CACHE_CONTROL to enum
intel/genxml/gfx125: Move STATE_SURFACE_TYPE to enum
Jordan Petridis (1):
Revert “ci: take microsoft farm offline”
Joshua Ashton (2):
nvk: Hook up driconf for nvk_instance
nvk: Enable KHR_present_id and KHR_present_wait
José Expósito (5):
zink: Fix crash on zink_create_screen error path
zink: fix dereference before NULL check
zink: allow software rendering only if selected
zink: initialize drm_fd to -1
egl/glx: fallback to software when Zink is forced and fails
José Roberto de Souza (56):
anv: Add missing ANV_BO_ALLOC_EXTERNAL flags when calling anv_device_import_bo()
intel: Add more information about the PAT entry used
intel: Update MTL scanout PAT entry
intel: Add a write combining PAT entry
anv: Honor memory coherency of the memory type selected
anv: Move PAT entry selection to common code
anv: Change default PAT entry to WC
anv: Calculate mmap mode based on alloc_flags
anv: Remove anv_bo flags that can be inferred from alloc_flags
iris: Add iris_bufmgr_get_pat_entry_for_bo_flags()
intel/common: Add intel_gem_read_correlate_cpu_gpu_timestamp()
anv: Reduce ifdefs in anv_GetCalibratedTimestampsEXT()
anv: Make use of intel_gem_read_correlate_cpu_gpu_timestamp()
intel/common/xe: Reimplement xe_gem_read_render_timestamp() with xe_gem_read_correlate_cpu_gpu_timestamp()
anv: Bring back the non optimized version of build_load_render_surface_state_address()
intel: Sync xe_drm.h
intel: Sync xe_drm.h
iris: Change default PAT entry to WC
intel: Rename PAT entries
intel: Share function to do device query in Xe KMD
iris: Check for maximum allowed priority in Xe KMD
anv: Rename ANV_BO_ALLOC_SNOOPED to ANV_BO_ALLOC_HOST_CACHED_COHERENT
anv: Add support for all possible cached and coherent memory types
intel: Add PAT entries for gfx12 and newer
intel: Sync xe_drm.h
intel: Enable has_set_pat_uapi for Xe
iris: Prepare iris_heap_to_pat_entry() for discrete GPUs
iris: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs
anv: Prepare anv_device_get_pat_entry() for discrete GPUs
anv: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs
anv: Add heaps for Xe KMD in platforms without LLC
intel/dev: Adjust prefetch_size values for Xe2 engines
anv: Fix vm bind of DRM_XE_VM_BIND_FLAG_NULL
iris: Fix the mmap mode for IRIS_HEAP_DEVICE_LOCAL_PREFERRED
intel: Sync xe_drm.h take 2 part 3
intel/isl: Set mocs.blitter_dst/src for MTL
anv: Fix handling of host_cached_coherent bos in gen9 lp in older kernels
anv: Split ANV_BO_ALLOC_HOST_CACHED_COHERENT into two actual flags
anv: Promote bos to host_cached+host_coherent in platforms with LLC
anv: Avoid unnecessary intel_flush calls
intel/genxml/xe2: Update PIPE_CONTROL
intel/genxml/xe2: Update PIPELINE_SELECT
intel: Sync xe_drm.h final part
anv: Remove libdrm usage from Xe KMD backend
anv: Add ANV_BO_ALLOC_IMPORTED
anv: Replace anv_bo.vram_only by anv_bo.alloc_flags check
anv: Assume that imported bos already have flat CCS requirements satisfied
intel/isl/xe2: Enable route of Sampler LD message to LSC
utils/u_debug: Fix parse of “all,<something else>”
anv: Increase ANV_MAX_QUEUE_FAMILIES
anv: Drop useless STATIC_ASSERT in anv_physical_device_init_queue_families()
anv: Simplify companion_rcs handling
anv: Add missing anv_measure_submit() calls in Xe KMD backend
anv: Fix anv_measure_start/stop_snapshot() over copy or video engine
anv: Call anv_measure_submit() before anv_cmd_buffer_chain_command_buffers()
anv: Fix PAT entry for userptr in integrated GPUs
Juan A. Suarez Romero (12):
v3d/ci: run V3D GL tests in 64-bits
v3d: use kmsro to create drm screen on real hw
vc4/ci: comment why piglit is disabled
broadcom/ci: separate hidden jobs to -inc.yml files
v3d: include the revision in the device name
ci/baremetal: make BM_BOOTCONFIG optional
ci: do not mount already mounted directories
ci/v3d/vc4: remove explicit modules to load
ci/v3dv: add new failures
ci/v3dv: update results
ci/vc4/v3d: remove some flakes
ci/v3d: add support for rpi5
Julia Zhang (1):
radeonsi: modify binning settings to improve performance
Juston Li (17):
venus: add helper function to get cmd handle
venus: refactor out common cmd feedback functions
venus: support deferred query feedback recording
venus: track/recycle appended query feedback cmds
venus: append query feedback at submission time
venus: switch to unconditionally deferred query feedback
venus: sync protocol for VK_EXT_extended_dynamic_state3
venus: pipeline fixes for VK_EXT_extended_dynamic_state3
venus: enable VK_EXT_extended_dynamic_state3
venus: disable unsupported ExtendedDynamicState3Features
venus: implement vkGet[Device]ImageSparseMemoryRequirements
radv: enable stippledBresenhamLines on GFX9 chips
venus: fix query feedback copy sanitize off by 1
venus: rename buffer cache to buffer reqs cache
venus: use vk_format helper for plane count
venus: support caching image memory requirements
venus: add LRU cache eviction for image mem reqs cache
Kai Wasserbäch (1):
fix: ac/llvm: LLVM 18: remove useless passes, partially removed upstream
Karol Herbst (74):
vtn/opencl: always lower to libclc fmod
rusticl/device: restrict image_buffer_size
rusticl/device: restrict param_max_size further
rusticl/mem: properly set pipe_image_view::access
zink: support CLAMP_TO_BORDER with unnormalized coords
zink: alias nir scratch memory by lowering to common bit_size
zink: emit float controls
zink: lower fisnormal as it requires the Kernel Cap
radv: fix buffers in vkGetDescriptorEXT with size not aligned to 4
rusticl/queue: Only take a weak ref to the last Event
rusticl/device: restrict const max size to 1 << 26 bytes
rusticl/mesa: pass PIPE_BIND_LINEAR in resource_create_texture_from_user
rusticl: handle failed maps gracefully
zink: validate pointer alignment in resource_from_user_memory
zink: handle denorm preserve execution modes
zink: deallocate global_bindings array
zink: emit MemoryAccess flags for coherent global load/stores
rusticl/mesa/screen: do not dereference the entire pipe_screen struct
nir: Stop assuming glsl_get_length() returns 0 for vectors
ir2: Stop assuming glsl_get_length() returns 0 for vectors
nvc0: implement PIPE_CAP_TIMER_RESOLUTION
radeonsi: support importing arbitrary resources
radeonsi: hack for importing 3D textures
rusticl/context: fix importing gl cube maps
docs/features: mark rusticl gl_sharing as done
rusticl/queue: do not send empty lists of event to worker queue
rusticl/queue: fix implicit flushing of queue dependencies
rusticl: only support the matching device for gl_sharing
rusticl/memory: fix new clippy::needless-borrow warning
nir: allow vec derefs on system values
vtn: add hack for system values placed in CrossWorkgroup memory
rusticl/api: workaround DPCPP fetching clSetProgramSpecializationConstant
rusticl: add x11 dependency
rusticl/gl: make GLX support optional
clc: allow debug flag to be read from other files
clc: add dump_llvm debug options
nir/opt_preamble: make load_workgroup_size handling optional
radeonsi: lower relative shuffle subgroup ops
radeonsi: lower 64bit subgroup shuffle to 32 bit
clc: add support for cl_khr_subgroup_shuffle and shuffle_relative
rusticl: implement cl_khr_subgroup_shuffle and shuffle_relative
ci/fedora: bump to meson 1.3.0
rusticl: bump meson req
rusticl: use rust.proc_macro for proc macros
clc: use addMacroDef/Undef instead of -D/-U flags
nak: fix some sm checks for volta
nir/algebraic: add support for custom arguments
nak: add algebraic lowering pass
nak: move nir_lower_subgroups into nak_postprocess_nir
rusticl/kernel: explicitly set rounding modes
radeonsi: fix reg_saved_mask for non graphics contexts
clc: add workaround for clang always defining __IMAGE_SUPPORT_ and __opencl_c_int64
rusticl: do not warn on empty RUSTICL_DEBUG or RUSTICL_FEATURES
rusticl: silence clippy::arc-with-non-send-sync for now
rusticl: fix constant and printf buffer size
rusticl/nir: add missing nir include
rusticl: check rustc version for flags requiring newer rustc/clippy
ci: merge debian-rusticl-testing into debian-testing
zink: lock screen queue on context_destroy and CreateSwapchain
clc: remove code supporting pre llvm-10
zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources
rusticl: specify buffer bindings explicitly
rusticl: add QueueContext to track GPU state
rusticl/queue: release bound constant buffer
rusticl: use real buffer for cb0 for drivers preferring
ci,rusticl: bump meson req to 1.3.1
rusticl/meson: generate bindings for LLVM
rusticl/program: add LLVM functions to cache timestamp
rusticl/llvm: do not include spirv-tools/linker.hpp
rusticl/kernel: run opt/lower_memcpy later to fix a crash
nir: rework and fix rotate lowering
nak/opt_out: fix comparison in try_combine_outs
rusticl/kernel: check that local size on dispatch doesn’t exceed limits
clc: force fPIC for every user when using shared LLVM
Kenneth Graunke (21):
intel/compiler: Delete unused emit_dummy_fs()
intel/compiler: Delete unused repclear shader uniform handling
intel/compiler: Delete repclear shader’s special case for 1 color target
intel/compiler: Drop unused saturate handling in repclear shader
intel/compiler: Convert the repclear shader to use send-from-GRF
intel/compiler: Assert that FS_OPCODE_[REP_]FB_WRITE is for pre-Gfx7
iris: Make an iris_bucket_cache structure and array per heap
iris: Make an iris_heap_is_device_local() helper
iris: Rename heap_flags -> heap in i915_gem_create
iris: Split system memory heap into cached-coherent and uncached heaps
iris: Use 64K BOs for the shader uploader
iris: Align fresh BO allocations to 2MB in size
iris: Ensure virtual addresses are aligned to 2MB for 2MB+ blocks
anv: Implement rudimentary VK_AMD_buffer_marker support
anv: Drop 3/4 of PPGTT size restriction for sys heap size calculation
anv: Don’t report more memory available than the heap size
intel/fs: Allow omitting the destination of A64 untyped atomics
intel/fs: Drop opt_register_renaming()
iris: Initialize bo->index to -1 when importing buffers
iris: Don’t search the exec list if BOs have never been added to one
iris: Skip mi_builder init for indirect draws
Konstantin Seurer (40):
radv: Add RADV_MAX_HIT_ATTRIB_DWORDS
radv/nir: Add radv_nir_lower_hit_attrib_derefs
radv/nir: Handle boolean hit attribs
radv/clang-format: Do not indent C++ modifiers
radv: Add radv_nir_lower_hit_attrib_derefs_tests
radv/sqtt: Fix tracing acceleration structure commands
radv/sqtt: Handle monolithic RT pipelines
radv/rt: Use a helper for inlining non-recursive stages
radv/rt: Skip null checks for small case counts
nir/lower_vars_to_scratch: Remove all unused derefs
drm-shim/nouveau: Set nv_device_info_v0::platform
drm-shim/nouveau: Expose the 2D engine on NV50+
drm-shim/nouveau: Stub missing ioctls
nvk: Do not preserve metadata after lower_load_global_constant_offset_instr
radv: Add more offsets acceleration_structure_layout
radv/bvh: Stop emitting leaf nodes inside the encoder
nir: Optimize fpow with small constant exponents
radv: Implement VK_KHR_ray_tracing_position_fetch
radv: Make pipeline cache object data generic
radv: Don’t store library stack sizes
radv: Add more ray tracing data to the cache
radv/rt: Skip compiling a traversal shader
radv: Skip compiling chit and miss shaders
radv/rt: Remove useless assert
radv/rt: Use radv_shader for compiled shaders
radv/sqtt: Avoid duplicate stage check
radv/rt: Repurpose radv_ray_tracing_stage_is_compiled
vtn: Remove transpose(m0)*m1 fast path
ac/nir: Export clip distances according to clip_cull_mask
vtn: Handle DepthReplacing correctly
radv/rmv: Fix tracing ray tracing pipelines
radv/rt/rmv: Log pipeline library creation
radv: Use PLOC for TLAS builds
radv: Remove the BVH depth heuristics
radv/rt: Lower ray payloads to registers
vtn: Allow for OpCopyLogical with different but compatible types
ac/llvm: Enable helper invocations for quad OPs
lavapipe: Fix DGC vertex buffer handling
lavapipe: Mark vertex elements dirty if the stride changed
lavapipe: Report the correct preprocess buffer size
Lang Yu (1):
radeonsi: emit SQ_NON_EVENT for GFX11_5
Leo Liu (2):
gallium/vl: match YUYV/UYVY swizzle with change of color channels
radeonsi: fix video processing path without VPE enabled
LingMan (9):
rusticl: Show an error message if the build is attempted with an outdated bindgen version
rusticl: Show an error message if the version of bindgen can’t be detected
rusticl: Directly pass a `&Device` to `Mem::map_image` and `Mem::map_buffer`
rusticl: Only put an Arc around PipeScreen where needed
rusticl: Avoid repeatedly creating Vecs during Platform initialization
rusticl: Turn pointers in enqueue_svm_mem_fill_impl into proper Rust types
rusticl: Turn pointers in enqueue_svm_memcpy_impl into slices
rusticl/api: Add checking wrappers around `slice::from_raw_parts{_mut}`
rusticl: Use the `from_raw_parts` wrappers
Lionel Landwerlin (88):
intel/fs: fix dynamic interpolation mode selection
anv/meson: add missing dependency on the interface header
anv: ensure we reapply always pipeline dynamic state in runtime state
intel/fs: Xe2 fix for ExBSO on UGM
blorp: handle binding table & surface state allocation failures
anv: rename internal heaps
anv: deal with state stream allocation failures
anv: add max_size argument for block & state pools
anv: make sure pools can handle more than 2Gb
anv: fail pool allocation when over the maximal size
anv: use anv_state_pool_state_address for blorp vertex buffer address
anv: fix corner case of mutable descriptor pool creation
anv: dynamically allocate utrace batch buffers
perfetto/pps-producer: add optimized cpu/gpu timestamp correlation support
intel/ds: use improved timestamp correlation if available
isl: disable MCS compression on R9G9B9E5
intel: fix PXP status check
anv: handle protected memory allocation
anv: allow creation of protected queues
anv: Emit protection + session ID on protected command buffers
anv: allow protected GEM context creation
anv: enable protected memory
intel/fs: fix residency handling on Xe2
anv: workaround XeSS for Satisfactory
intel/fs: rerun divergence analysis prior to convert_from_ssa
intel/nir/rt: fix reportIntersection() hitT handling
anv: fix source_hash propagation with libraries
anv: fix missing naming for dirty bit
anv: fix CC_VIEWPORT pointer dirty after blorp/simple-shaders
anv: fix dirty state tracking for 3DSTATE_PUSH_CONSTANT_ALLOC
intel/decoder: handle 3DPRIMITIVE_EXTENDED in accumulated prints
intel/blorp: move Wa_18019816803 out of blorp code
anv: get rid of the duplicate pipeline fields in command buffer state
anv/blorp: move helper function about BTI changes to blorp
intel/perf: fix querying of configurations
intel/fs: fix incorrect register flag interaction with dynamic interpolator mode
intel/fs: reuse set_predicate()
intel/aux_map: introduce ref count of L1 entries
anv: use main image address to determine ccs compatibility
anv: track & unbind image aux-tt binding
anv: remove heuristic preferring dedicated allocations
intel/ds: add trace of buffer markers
intel/tools: add hang_replay tool
intel/hang_replay: add the ability to pass the context image to sim-drm
intel: add error2hangdump tool
intel/aubinator_error_decode: bump max buffers to 1024
intel/error_decode: map i915 gfx12.5 register names to our names
intel/tools: hang viewer/editor
anv: add a sampler state pool
anv: move descriptor set type selection to earlier
anv: make a couple of descriptor functions private
anv: add missing push descriptor flush on ray tracing pipelines
anv: set layout printer
anv: use 2 different buffers for surfaces/samplers in descriptor sets
intel/hang_replay: fix compile race with generated files
intel/tools: 32bit compile fixes
vulkan/runtime: retain video session creation flags
anv/video: only report matching memory types for protected sessions
util/u_printf: add a u_printf_ptr() variant
nir: make printf_info (de)serializer available
nir/clone: fix missing printf_info clone
nir: include printfs from linked shaders
nir/divergence: handle printf intrinsic
nir/serialize: untangle printf serialization from a particular stage
nir: fixup nir_printf intrinsic description
anv: fix incorrect queue_family access on command buffer
isl: constify isl_device_get_sample_counts()
anv: get features after initializing drm
anv: switch to use runtime physical device properties infrastructure
anv: promote EXT_vertex_attribute_divisor to KHR
anv: promote EXT_calibrated_timestamps to KHR
isl: drop AUX-TT CCS alignment with INTEL_DEBUG=noccs
anv: wait for CS write completion before executing secondary
isl: further restrict alignment constraints
isl: implement Wa_22015614752
intel/fs: fix depth compute state for unchanged depth layout
anv: remove ANV_ENABLE_GENERATED_INDIRECT_DRAWS variable
anv: fix disabled Wa_14017076903/18022508906
intel/aux_map: fix fallback unmapping range on failure
anv: hide vendor ID for The Finals
anv: fix pipeline executable properties with graphics libraries
anv: implement undocumented tile cache flush requirements
anv: don’t prevent L1 untyped cache flush in 3D mode
anv: add missing alignment for AUX-TT mapping
anv: factor out aux-tt binding logic for future reuse
anv: rename aux_tt image field
anv: retain ccs image binding address
anv: fix transfer barriers flushes with compute queue
Louis-Francis Ratté-Boulianne (4):
panfrost: factor out method to check whether we can discard resource
panfrost: add copy_resource flag to pan_resource_modifier_convert
panfrost: add can_discard flag to pan_legalize_afbc_format
panfrost: Legalize before updating part of a AFBC-packed texture
Luc Ma (1):
loader: Remove a line of unused include
Luca Weiss (1):
freedreno: Enable A305B
Lucas Fryzek (2):
freedreno/drm: Add more APIs to per backend API
gallivm/nir: Load all inputs into indirect inputs array
Lucas Stach (2):
etnaviv: drm: don’t update cmdstream timestamp when skipping submit
etnaviv: disable 64bpp render/sampler formats
Lynne (1):
radv: change queue family order in radv_get_physical_device_queue_family_properties
M Henning (21):
nak: Fix a warn(unused_must_use) by calling drop
nak: Remove MemScope::Cluster
nak: Memory order/scope encodings for Ampere
nak: Specify MemScope on MemOrder::Strong
nak: Bind nir_intrinsic_access
nak: Add MemOrder::Constant
nvk: Use load_global_constant for ubo loads
nak: Add encodings for cache eviction priorities
nak: Set “evict first” from ACCESS_NON_TEMPORAL
nak: Request alignment that matches the load width
nak: Use nir_combined_align
nvk: Fix descriptor alignment offset
nak: Provide robustness info to postprocess_nir
nak: Call nir_opt_load_store_vectorize
nak: Call nir_opt_combine_barriers
nak: Call nir_opt_shrink_vectors
nak: Clamp negative texture array indices to zero
nak: Enable loop unrolling.
nak: Print out an instruction count
nak: Add a jump threading pass
nak: Optimize jumps to fall-through if possible
Marcin Ślusarz (1):
anv: fix minSubgroupSize for xe2
Marek Olšák (199):
radeonsi: initialize perfetto in the right place
ac: add missing gfx11.5 bits
ac/gpu_info: adjust attribute ring size for gfx11
ac/surface: cosmetic changes
ac/surface/tests: cosmetic changes
radeonsi: don’t use nir_optimization_barrier_vgpr_amd with ACO
radeonsi: inline si_allocate_gds and si_add_gds_to_buffer_list
radeonsi: inline si_screen_clear_buffer
radeonsi: remove redundant VS_PARTIAL_FLUSH for streamout
radeonsi: remove AMD_DEBUG=nogfx
radeonsi: rename ctx -> sctx in si_emit_guardband
radeonsi: remove and inline si_shader::ngg::prim_amp_factor
radeonsi: decrease PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS to 1024
radeonsi: cosmetic changes in si_pm4.c
radeonsi: split setting num_threads in si_emit_dispatch_packets
radeonsi: use si_shader_uses_streamout properly
radeonsi: adjust setting PA_SC_EDGERULE once more
radeonsi: various isolated cosmetic changes
radeonsi: move max_dist for MSAA into si_state_msaa.c
radeonsi: cosmetic changes in si_state_viewport.c
radeonsi: cosmetic changes in si_state_binning.c, si_state_msaa.c
radeonsi: move setting registers at the end of si_emit_cb_render_state
ac/gpu_info: split has_set_pairs_packets into context and sh flags
ac/gpu_info,llvm: trivial cosmetic changes
radeonsi: clean up si_set_streamout_targets
radeonsi: upload shaders using a compute queue instead of gfx
radeonsi: rewrite PM4 packet building helpers with less duplication
radeonsi: move buffered_xx_regs into a substructure
radeonsi: rename HAS_PAIRS -> HAS_SH_PAIRS_PACKED
radeonsi: rename radeon_*push_*_sh_reg -> gfx11_*push_*_sh_reg
radeonsi: rewrite gfx11_*push*_sh_reg helpers
radeonsi: restructure blocks in si_setup_nir_user_data
radeonsi: restructure blocks in si_emit_graphics_{shader,compute}_pointers
radeonsi/gfx11: use PKT3_SET_CONTEXT_REG_PAIRS_PACKED for PM4 states
radeonsi: don’t call nir_lower_compute_system_values too many times
radeonsi: don’t check DCC compatibility on chips where it’s no-op
radeonsi: cosmetic changes in si_emit_db_render_state
radeonsi: prettify code around PA_SC_LINE_STIPPLE
radeonsi: move emitting VGT_TF_PARAM into gfx10_emit_shader_ngg
radeonsi: remove num_params variable from gfx10_shader_ngg
radeonsi: move SPI_SHADER_IDX_FORMAT into the preamble (it’s immutable)
radeonsi: adjust the total viewport area
radeonsi/gfx11: use SET_CONTEXT_REG_PAIRS_PACKED for other states
radeonsi/gfx11: don’t set OREO_MODE to fix rare corruption
radeonsi: don’t dma-upload shaders on APUs
radeonsi/ci: update failures for gfx103
st/mesa: disable light_twoside if back faces are culled
glsl/nir: return failure from link_varyings if there is a linker error
nir: add lowering from FS LAYER input to LAYER_ID sysval
nir: return progress from nir_remove_sysval_output
ac/nir: add kill_layer flag to VS/GS/NGG lowering
st/mesa: set pipe_framebuffer_state::layers for PBO blits
radeonsi: clean up si_nir_kill_outputs
radeonsi: don’t allocate output space for LAYER/VIEWPORT before TES and GS
radeonsi: implement gl_Layer in FS as a system value
radeonsi: remove the LAYER output if the framebuffer state has only 1 layer
nir: fix gathering TESS_LEVEL_INNER/OUTER usage with lowered IO
nir: don’t declare illegal varyings in nir_create_passthrough_tcs
nir/print: print PATCH0 and VARn_16BIT names instead of numbers for TCS and TES
gallium/docs: make CAP doc order match definition order
gallium: add PIPE_CAP_PERFORMANCE_MONITOR for GL_AMD_performance_monitor
radeonsi: group equal CAP cases
radeonsi: only expose GL_AMD_performance_monitor on gfx7-10.3
ac: rename ac_parse_ib.c -> ac_ib_parser.c
ac: move the IB parsers into ac_parse_ib.c
ac: add an IB parser that gathers context rolls
mesa: optimize _mesa_matrix_is_identity
mesa: skip checking for identity matrix in glMultMatrixf with glthread
mesa: optimize setting the identity matrix
glthread: add a marker at the end of batches indicating the end
glthread: eliminate push/pop calls in PushMatrix+Draw/MultMatrixf+PopMatrix
glthread: add option to put autogenerated marshal structures in the header file
glapi: rename primcount -> instance_count in a few Draw functions
glthread: use autogenerated marshal structures for custom functions
glthread: rework type reduction and reduce vertex stride params to 16 bits
glapi: only expose GL_EXT_direct_state_access functions to GL compatibility
glthread: don’t do “if (COMPAT)” if the function is not in the GL core profile
glapi: only allow deprecated="" on non-aliased functions
glthread: pass struct marshal_cmd_DrawElementsUserBuf into Draw directly
mesa: deduplicate glVertexPointer and glNormalPointer vs DSA error checking
glthread: add a string table of function names
radeonsi/gfx11: fix unaligned SET_CONTEXT_PAIRS_PACKED
radeonsi: don’t set non-existent VGT_GS_MAX_PRIMS_PER_SUBGROUP on gfx10
radeonsi: change the low-priority compiler queue to normal priority
radeonsi: update shaders for blend state only if the shader key changed
radeonsi: update shaders for rasterizer state only if the shader key changed
radeonsi: clean up setting poly/line/stipple shader key bits
radeonsi: rewrite how shader key bits dependent on current_rast_prim are updated
radeonsi: rewrite si_get_total_colormask as si_any_colorbuffer_written
radeonsi: in bind_{blend,rs}_state, only call 1 update function per if
radeonsi/gfx11: skip si_set_streamout_enable because it has no effect
radeonsi: execute streamout_begin after cache flushes
radeonsi: don’t print the preamble state separately for GALLIUM_DDEBUG
radeonsi: replace gl_FrontFacing with a constant if one side is always culled
radeonsi: set OOB_SELECT for VBOs in si_create_vertex_elements
radeonsi: group most vertex element fields
radeonsi/gfx11: prefer Wave64 for PS without inputs for better VALU perf
radeonsi/gfx11: disable the shader profile for Medical that forces Wave64
radeonsi/gfx11: disable the shader profile for Medical that disables binning
radeonsi: clean up how debug flags and shader profiles determine the wave size
radeonsi/gfx11: prefer Wave64 for VS/TCS/TES/GS because it’s slightly faster
winsys/amdgpu: bypass GL2 for command buffers
radeonsi: track NIR progress properly for optimizations in si_get_nir_shader
ac,radeonsi: rename pos_inputs -> fragcoord_components
nir,radeonsi: add FLAGS into load_vector_arg_amd to record color input usage
radeonsi: change the signature of si_nir_lower_ps_color_input
radeonsi: gather lowered color inputs for monolithic PS
radeonsi: add PS input info into si_shader_binary_info
radeonsi: don’t include the PARAM_GEN input in si_shader_info
radeonsi: decrease NUM_INTERP if uniform inlining eliminated PS inputs
radeonsi: update comments about uniform inlining
radeonsi: decrease NUM_INTERP if export formats/colormask eliminated PS inputs
util: make BITSET_TEST_RANGE_INSIDE_WORD take a value to compare with
radeonsi: merge context_reg_saved_mask and other_reg_saved_mask into a BITSET
radeonsi: convert depth-stencil-alpha state to tracked registers
radeonsi: convert rasterizer state to tracked registers
ac/gpu_info: fix printing radeon_info after adding VPE
radeonsi: rework how guardband registers are updated to decrease overhead
mesa: fix _mesa_matrix_is_identity
mesa: remove some DrawTransformFeedback duplication
mesa: remove some DrawElementsInstanced duplication
mesa: remove more DrawArrays/Elements duplication
mesa: remove non-relevant 16-year-old comment
st/mesa: make prepare_(indexed_)draw non-static
mesa: inline st_draw_transform_feedback
mesa: call st_prepare_(indexed_)draw before Driver.DrawGallium(MultiMode)
st/mesa: no need to check index_size in st_prepare_indexed_draw anymore
mesa: move index bounds code (st_prepare_indexed_draw) into draw.c
cso: do cso_context inheritance how we do it elsewhere
cso: inline cso_get_pipe_context
mesa: execute an error path sooner in _mesa_validated_drawrangeelements
gallium: add typedef pipe_draw_func matching the draw_vbo signature and use it
ac/llvm: remove code for converting txd from 1D to 2D because NIR does it
ac,radeonsi: require DRM 3.27+ (kernel 4.20+) same as RADV
winsys/amdgpu: don’t return a value from cs_add_buffer
winsys/amdgpu: cosmetic changes in amdgpu_cs_add_buffer
winsys/amdgpu: inline amdgpu_add_fence_dependencies_bo_lists
winsys/amdgpu: use inheritance for the cache_entry BO field
winsys/amdgpu: use inheritance for the real BO
winsys/amdgpu: use inheritance for the sparse BO
winsys/amdgpu: use inheritance for the slab BO
winsys/amdgpu: move lock from amdgpu_winsys_bo into sparse and real BOs
winsys/amdgpu: don’t count memory usage because it’s unused
winsys/amdgpu: change real/slab/sparse_buffers to buffer_lists[3]
winsys/amdgpu: change amdgpu_lookup_buffer to take struct amdgpu_buffer_list
winsys/amdgpu: clean up duplicated code around amdgpu_lookup/add_buffer
winsys/amdgpu: return amdgpu_cs_buffer* from add/lookup_buffer instead of index
winsys/amdgpu: pass amdgpu_buffer_list* to amdgpu_add_bo_fences_to_dependencies
winsys/amdgpu: clean up the rest of the code for cs->buffer_lists
winsys/amdgpu: fix amdgpu_cs_has_user_fence for VPE
winsys/amdgpu: document BO structures
ci: disable the google/freedreno farm because it’s down
glthread: add a missing end-of-batch marker
mesa: micro-improvements in draw.c
st/mesa: restore pipe_draw_info::mode at the end of st_hw_select_draw_gallium
mesa: add a pipe_draw_indirect_info* parameter into the DrawGallium callback
mesa: enable GL_SELECT and GL_FEEDBACK modes for indirect draws
winsys/amdgpu: reduce wasted memory due to the size tolerance in pb_cache
gallium/pb_slab: move group_index and entry_size from pb_slab_entry to pb_slab
iris,zink,winsys/amdgpu: remove unused/redundant slab->entry_size
winsys/amdgpu: rename amdgpu_bo_slab to amdgpu_bo_slab_entry
winsys/amdgpu: stop using pb_buffer::vtbl
gallium/pb_cache: remove pb_cache_entry::end to save space
gallium/pb_cache: switch time variables to milliseconds and 32-bit type
radeon_winsys: add struct radeon_winsys* parameter into fence_reference
r300,r600,radeon/winsys: always pass the winsys to radeon_bo_reference
winsys/amdgpu: don’t layer slabs, use only 1 level of slabs, it improves perf
winsys/amdgpu: add amdgpu_bo_real_reusable slab for the backing buffer
winsys/amdgpu: remove now-redundant amdgpu_bo_slab_entry::real
winsys/amdgpu: remove va (gpu_address) from amdgpu_bo_slab_entry
winsys/amdgpu: don’t use gpu_address to compute slab entry offset in bo_map
gallium/pb_buffer: define pb_buffer_lean without vtbl, inherit it by pb_buffer
gallium/pb_cache: switch to pb_buffer_lean
gallium/pb_cache: remove pb_cache_entry::mgr
gallium/pb_cache: remove pb_cache_entry::buffer
winsys/radeon: stop using pb_buffer::vtbl
r300,r600,radeonsi: switch to pb_buffer_lean
winsys/amdgpu: allocate 1 amdgpu_bo_slab_entry per cache line
winsys/amdgpu: compute bo->unique_id at pb_slab_alloc, not at memory allocation
winsys/amdgpu: rewrite BO fence tracking by adding a new queue fence system
winsys/amdgpu: rename amdgpu_winsys_bo::bo -> bo_handle
winsys/amdgpu: rename amdgpu_bo_sparse::lock -> commit_lock
winsys/amdgpu: rename amdgpu_bo_real::lock to map_lock
winsys/amdgpu: remove dependency_flags parameter from cs_add_fence_dependency
winsys/amdgpu: implement explicit fence dependencies as sequence numbers
winsys/amdgpu: use pipe_reference for amdgpu_ctx refcounting
winsys/amdgpu: don’t use amdgpu_fence::ctx for fence dependencies
winsys/amdgpu: simplify code using amdgpu_cs_context::chunk_ib
radeonsi/ci: add gfx11 flakes
glthread: don’t unroll draws using user VBOs with GLES
glthread: add proper helpers for call fences
gallium/u_threaded_context: use function table to jump to different draw impls
mesa,u_threaded_context: add a fast path for glDrawElements calling TC directly
gallium/u_threaded: use a dummy end call to indicate the end of the batch
gallium/u_threaded: remove unused param from tc_bind_buffer/add_to_buffer_list
gallium/u_threaded: keep it enabled even if the CPU count is 1
meson: require libdrm_amdgpu 2.4.119
winsys/amdgpu: remove amdgpu_bo_real::gpu_address, use amdgpu_va_get_start_addr
winsys/amdgpu: remove amdgpu_bo_sparse::gpu_address, use amdgpu_va_get_start_addr
Mario Kleiner (1):
v3d: add B10G10R10[X2/A2]_UNORM to format table.
Mark Collins (8):
meson: Only include virtio when DRM available
meson: Only link libvdrm to Turnip with virtio KMD
meson: Update lua wrap to 5.4.6-4
freedreno/rddecompiler: Emit explicit scope for CP_COND_REG_EXEC
freedreno/rddecompiler: Decode ELSE branches using NOPs
freedreno/rddecompiler: Reset buffers after RD_CMDSTREAM_ADDR
freedreno/rddecompiler: Print pkt values in hex
freedreno/rddecompiler: Add ability to read GPU buffer into file
Mark Janes (7):
iris: make shader cache content deterministic
anv: make shader cache content deterministic
intel: remove workaround for preproduction DG2 steppings
intel/dev: improve descriptions of workaround macros.
intel/dev: poison macros for workarounds fixed at a stepping
intel: remove MTL a0 workarounds
intel/dev: update workaround definitions to latest defect status
Mart Raudsepp (1):
docs: Fix typo in OpenGL 3.3 support on Asahi
Martin Roukala (né Peres) (12):
zink/ci: drop the concurrency of the zink-radv-vangogh-valve job
ci/b2c: fix artifact collection
radv/ci: fix `vkcts-navi21-valve` execution
Revert “ci/deqp-runner: turn paths in errors into links”
radv: disable meshShaderQueries on gfx10.3
amd/ci: reduce Renoir’s concurrency to 16
ci/b2c: fix the `cmdline_extra` variable name
ci: disable the valve-kws farm until it can be rebooted
Revert “ci: disable the valve-kws farm until it can be rebooted”
ci: disable mupuf’s farm
ci: disable collabora’s farm which appears to be down
Revert “ci: disable mupuf’s farm”
Mary Guillemard (37):
venus: skip bind sparse info when checking for feedback query
nir: Add AGX-specific doorbell and stack mapping opcodes
agx: Add doorbell and stack mapping opcodes
agx: Handle doorbell and stack mapping intrinsics
asahi: clc: Handle doorbell and stack mapping intrinsics
agx: Add stack load and store opcodes
agx: Implement scratch load/store
agx: Add stack adjust opcode
agx: Emit stack_adjust in the entrypoint
zink: Check for VK_EXT_extended_dynamic_state3 before setting A2C
nak: sm75: Fix panic when encoding MUFU with SQRT and TANH
nak: Make PRMT selection a Src
nak: Add support for fddx and fddy
nak: Add for_each_instr in Shader
nak: Gather global memory usage for ShaderInfo
nak: Fix ALD/AST encoding for vtx and offset
nak: Add a complete wrapper around SPH
nak: Collect information to create SPH
nak: Remove encode_hdr_for_nir
nak: Restructure ShaderInfo
nak: Add geometry shader support
nak: Ensure we allocate one barrier when using BAR.SYNC
nak: Implement VK_KHR_shader_terminate_invocation
nak: Move nir_lower_int64 after I/O lowering
nak: Pass offset to load_frag_w
nak: Rewrite nir_intrinsic_load_sample_pos and implement nir_intrinsic_load_barycentric_at_sample
nir: Add a ldtram_nv intrinsic
nak: Add more bits discovered in SPH
nvk: Implement VK_KHR_fragment_shader_barycentric
nvk: Disable flush on each queries and flush at the end
nvk: Implement VK_EXT_primitives_generated_query
venus: Do not submit batch manually when no feedback is required
nak: Fix NAK_ATTR_CLIP_CULL_DIST_7 wrong value
nak: sm50: Implement FFMA
zink: Force 128 fs input components under Venus for Intel
zink: Initialize pQueueFamilyIndices for image query / create
zink: Always fill external_only in zink_query_dmabuf_modifiers
Matt Turner (11):
r600: Add missing dep on git_sha1.h
util: Include stdint.h in libdrm.h
util: Provide DRM_DEVICE_GET_PCI_REVISION definition
ci/lava: Add firmware-misc-nonfree on amd64
intel: Only validate inst compaction if debugging a shader stage
iris: Only initialize batch decoder if necessary
symbols-check: Add _GLOBAL_OFFSET_TABLE_
nir: Fix cast
nir/tests: Reenable tests that failed on big-endian
util: Add DETECT_ARCH_HPPA macro
util/tests: Disable half-float NaN test on hppa/old-mips
Mauro Rossi (3):
Android.mk: filter out cflags to build with Android 14 bundled clang
Android.mk: disable android-libbacktrace to build with Android 14
Android.mk: be able to build radeonsi without llvm
Max R (3):
virgl: Implement clear_render_target and clear_depth_stencil
ci: Uprev virglrenderer
d3d10umd: Fix compilation
Maíra Canal (22):
v3dv: implement VK_EXT_multi_draw
v3dv: move multisync functions to the beginning of the file
v3dv: allow different in/out sync queues
v3dv: allow set_multisync() to accept more wait syncobjs
drm-uapi: extend interface for indirect CSD CPU job
v3dv: check CPU queue availability
v3dv: create a CPU queue type
v3dv: use the indirect CSD user extension
v3dv: occlusion queries aren’t handled with a CPU job
drm-uapi: extend interface for timestamp query CPU job
v3dv: use the timestamp query user extension
drm-uapi: extend interface for reset timestamp CPU job
v3dv: use the reset timestamp user extension
drm-uapi: extend interface for copy timestamp results CPU job
v3dv: use the copy timestamp query results user extension
drm-uapi: extend interface for the reset performance query CPU job
v3dv: don’t start iterating performance queries at zero
v3dv: use the reset performance query user extension
drm-uapi: extend interface for copy performance query CPU job
v3dv: use the copy performance query results user extension
v3d/v3dv: move V3D_CSD definitions to a separate file
v3dv: enable CPU jobs in the simulator
Michael Catanzaro (1):
util: create parents of disk cache directory if needed
Michael Tretter (1):
egl/wayland: fix formatting and add trailing comma
Michel Dänzer (2):
gallium/dri: Return __DRI_ATTRIB_SWAP_UNDEFINED for _SWAP_METHOD
glx: Handle IGNORE_GLX_SWAP_METHOD_OML regardless of GLX_USE_APPLEGL
Mike Blumenkrantz (48):
zink: don’t block large vram allocations
vulkan/wsi: unify all the image usage flag caps
draw: fix uninit variable false positive
zink: add copy box locking
tc: add non-definitive tracking for batch completion
tc: always track fb attachments
tc: add batch usage tagging to threaded_resource
tc: use strong refs for fb attachment tracking
tc: allow unsynchronized texture_subdata calls where possible
zink: handle unsynchronized image maps from tc
zink: barrier_cmdbuf -> reordered_cmdbuf
zink: assert that transfer_dst is available before doing buf2img
zink: rework cmdbuf submission to be more extensible
zink: add a third cmdbuf for unsynchronized (not reordered) ops
zink: add flag to restrict unsynchronized texture access
zink: add locking for batch refs
zink: enable unsynchronized texture uploads using staging buffers
ci: skip zink vram test
ci: bump VVL to 1.3.269
zink: emit SpvCapabilitySampleRateShading with SampleId
zink: always set VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT for usermem
zink: clamp resolve extents to src/dst geometry
zink: only emit xfb execution mode for last vertex stage
aux/u_transfer_helper: set rendertarget bind for msaa staging resource
zink: unset explicit_xfb_buffer for non-xfb shaders
mesa/st/texture: match width+height for texture downloads of cube textures
zink: add more locking for compute pipelines
radv: correctly return oom from the device when failing to create a cs
zink: make (some) vk allocation commands more robust against vram depletion
zink: check for cbuf0 writes before setting A2C
vk/cmd_queue: exempt more descriptor functions from autogeneration
vulkan: add wrappers for descriptor ‘2’ functions
zink: enforce maxTexelBufferElements for texel buffer sizing
zink: always force flushes when originating from api frontend
vk/cmd_queue: stop using explicit casts
vk/cmd_queue: generate maint6 functions
vk/cmd_queue: fix up indentation a little
lavapipe: maint6 descriptor stuff
lavapipe: maint6
zink: fix buffer rebind early-out check
zink: ignore tc buffer replacement info
vk/cmdbuf: add back deleted maint6 workgraph bits
lavapipe: use pushconstants2 for dgc
lavapipe: fix devenv icd filename
zink: fix separate shader patch variable location adjustment
zink: set more dynamic states when using shader objects
zink: always map descriptor buffers as COHERENT
zink: fix descriptor buffer unmaps on screen destroy
Mohamed Ahmed (4):
nvk: Fix GetImageSubResourceLayout for non-disjoint images
nil: Add support for linear images
nvk: Wire up rendering to linear
nvk: Enable linear images for texturing
Molly Sophia (1):
tu: Fix KHR_present_id and KHR_present_wait being used without initialization
Nanley Chery (11):
iris: Optimize BO_ALLOC_ZEROED for suballocations
iris: Zero the clear color before FCV_CCS_E rendering
iris: Don’t memset the clear color BO during aux init
iris: Simplify get_main_plane_for_plane
iris: Simplify a plane count check in from_handle
iris: Use helpers for generic aux plane importing
iris: Inline import_aux_info
iris: Use common res fields for imported planes
iris: Delay main and aux resource creation on import
isl: Handle MOD_INVALID in clear color plane check
iris: Fix lowered images in get_main_plane_for_plane
Neha Bhende (1):
ntt: lower indirect tesslevels in ntt
Patrick Lerda (1):
glsl/nir: fix gl_nir_cross_validate_outputs_to_inputs() memory leak
Paulo Zanoni (34):
anv: don’t forget to destroy device->vma_mutex
anv: alloc client visible addresses at the bottom of vma_hi
anv/sparse: join multiple bind operations when possible
anv/sparse: join multiple NULL binds when possible
anv/sparse: also print bind->address at dump_anv_vm_bind
intel/genxml: add the Gen12+ TR-TT registers
anv/sparse: extract anv_sparse_bind()
anv: setup the TR-TT vma heap
vulkan: fix potential memory leak in create_rect_list_pipeline()
anv/sparse: allow sparse resources to use TR-TT as its backend
anv/sparse: fix limits.sparseAddressSpaceSize when using vm_bind
anv/trtt: join L1 writes into a single MI_STORE_DATA_IMM when possible
anv/trtt: also join the L3/L2 writes into a single MI_STORE_DATA_IMM
anv/sparse: drop anv_sparse_binding_data from dump_anv_vm_bind()
anv/sparse: join all submissions into a single anv_sparse_bind() call
anv/sparse: pass anv_sparse_submission to the backend functions
anv/sparse: add ‘queue’ to anv_sparse_submission
anv/trtt: use ‘queue’ from anv_sparse_submission in the backend
anv/sparse: move waiting/signaling syncobjs to the backends
anv/sparse: process image binds before opaque image binds
anv/i915: extract setup_execbuf_fence_params()
anv/xe: allow passing extra syncs to xe_exec_process_syncs()
anv/trtt: don’t wait/signal syncobjs using the CPU anymore
anv/trtt: add struct anv_trtt_batch_bo and pass it around
anv/trtt: add support for queue->sync to the TR-TT batches
anv/trtt: properly handle the lifetime of TR-TT batch BOs
anv: enable sparse by default on i915.ko
anv/sparse: don’t support YCBCR 2x1 compressed formats
anv+zink/ci: document new sparse failures
anv/sparse: reject binds that are not a multiple of the granularity
anv/tr-tt: assert the bind size is a multiple of the granularity
anv/sparse: check if the non-sparse version is supported first
anv/sparse: document USAGE_2D_3D_COMPATIBLE as non-standard too
intel/tools: fix compilation of intel_hang_viewer on 32 bits
Pavel Asyutchenko (1):
mesa/main: allow S3TC for 3D textures
Pavel Ondračka (17):
r300: add late vectorization after nir_move_vec_src_uses_to_dest
r300: small address register load optimization
r300: nir fcsel/CMP lowering pass for R500
r300: add some more early bool lowering
r300: lower flrp in NIR
r300: fcsel_ge lowering from lowered ftrunc
r300: lower ftrunc in NIR
r300: remove backend CMP lowering
r300: remove backend LRP lowering
r300: mark load_ubo_vec4 with ACCESS_CAN_SPECULATE
r300: fix memory leaks in compiler tests
ci: uprev mesa-trigger container
ci: add r300 RV530 dEQP gles2 CI job
r300/ci: add missing kernel url quotes
r300/ci: switch to b2c v0.9.11
r300/ci: add piglit job
r300: fix reusing of color varying slots for generic ones
Peyton Lee (6):
frontends, va: add new parameters of post processor
amd,radeonsi: add libvpe
amd: add new hardware ip for vpe
amd, radeonsi: add si_vpe.c with helper functions of VPE lib
amd, radeonsi: supports post processing entrypoint
winsys, amdgpu, drm: add VPE submission handle
Phillip Pearson (1):
radeonsi: use PRIu64 instead of %lu for uint64_t formatting
Pierre-Eric Pelloux-Prayer (23):
mesa: restore call to _mesa_set_varying_vp_inputs from set_vertex_processing_mode
radeonsi/ci: update failures
radeonsi: check sctx->tess_rings is valid before using it
Revert “radeonsi: decrease PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS to 1024”
egl/wayland: set the correct modifier for the linear_copy image
radeonsi: use a compute shader to convert unsupported indices format
radeonsi: update guardband if vs_disables_clipping_viewport changes
radeonsi/sqtt: fix RGP pm4 state emit function
radeonsi/sqtt: clear record_counts variable
radeonsi/sqtt: rework pm4.reg_va_low_idx
radeonsi/sqtt: use calloc instead of malloc
radeonsi/sqtt: reformat with clang-format
radeonsi/sqtt: fix capturing indirect dispatches with SQTT
radeonsi/winsys: add cs_get_ip_type function
radeonsi/sqtt: fix emitting SQTT userdata when CAM is needed
radeonsi/sqtt: fix capturing RGP on RDNA3 with more than one Shader Engine
radeonsi/sqtt: handle COMPUTE queues as well
radeonsi: fix extra_md handling with fmask
ac/surface: don’t oversize surf_size
radeonsi: compute epitch when modifying surf_pitch
Revert “ci/radeonsi: disable VA-API testing on raven”
radeonsi: emit cache flushes before draw registers
radeonsi: adjust flags for si_compute_shorten_ubyte_buffer
Qiang Yu (35):
aco: do not fix_exports when separately compiled ngg vs or es
aco: add create_end_for_merged_shader
aco: extend max operands in a instruction to 128
aco: move end program handling to select_shader
aco: stop emit s_endpgm for first stage of merged shader
aco: add aco_is_gpu_supported
radeonsi: add vs prolog args needed by aco ls vgpr fix
radeonsi: fill aco shader info for part mode merged shader
radeonsi: enable aco compilation for merged shader parts
radeonsi: move use_aco to si_screen
radeonsi: move llvm compiler alloc/free into create/destroy function
radeonsi: stop llvm context creation when use aco
radeonsi: move llvm internal header to si_shader_llvm.h
radeonsi: selectively build si llvm compiler create/destroy
radeonsi: selectively build llvm compile
radeonsi: set use_aco when no llvm available
radeonsi: include ac_llvm_util.h when llvm available
radeonsi: disk cache remove llvm dependency when use aco
radeonsi: does not call llvm init when no llvm available
radeonsi: change compiler name for aco
radeonsi: selectively build llvm files
meson: be able to build radeonsi without llvm
radeonsi: fix piglit image coherency test when use aco
aco,radv: add aco_is_nir_op_support_packed_math_16bit
radeonsi: only vectorize nir ops that aco support
ac/llvm: remove nir_op_*2*mp ops handling
nir: add force_f2f16_rtz option to lower f2f16 to f2f16_rtz
aco,ac/llvm,radeonsi: lower f2f16 to f2f16_rtz in nir
aco: set MIMG unrm for GL_TEXTURE_RECTANGLE
aco: handle GL_TEXTURE_RECTANGLE in tg4_integer_workarounds
radeonsi: add missing args in spi_ps_input_ena when fbfetch output
nir: fix load layer id system_values_read info gather
aco: fix set_wqm segfault when ps prolog
radeonsi: fix legacy merged LS/ES workgroup size for aco compilation
radeonsi: unify elf and raw shader binary upload
Raphaël Gallais-Pou (1):
gallium: add sti DRM entry point
Rhys Perry (55):
nir: add helpers to skip idempotent passes
radv: use NIR_LOOP_PASS helpers
aco: add VALU/SALU/VMEM/SMEM statistics
aco: collect Pre-Sched SGPRs/VGPRs before spilling
radv: call lower_array_deref_of_vec before lower_io_arrays_to_elements
radv: skip radv_remove_varyings for mesh shaders
radv: disable gs_fast_launch=2 by default
aco/tests: fix tests with LLVM 17
aco/tests: fix tests with LLVM 18
aco: workaround LS VGPR initialization bug in RADV prologs
aco: skip LS VGPR initialization bug workaround if the prolog exists
radv: set prolog as_ls if has_ls_vgpr_init_bug=true
docs: fix RADV_THREAD_TRACE_CACHE_COUNTERS default
nir/lower_fp16_casts: correctly round RTNE f64->f16 casts
nir/lower_fp16_casts: add option to split fp64 casts
radeonsi: use nir_lower_fp16_casts
radv: use nir_lower_fp16_casts
aco: remove f16<->f64 conversions
intel/compiler: use nir_lower_fp16_casts
radv: add radv_disable_trunc_coord option
radv: enable radv_disable_trunc_coord for vkd3d-proton/DXVK
ac/gpu_info: update conformant_trunc_coord comment
ac/nir: fix partial mesh shader output writes on GFX11
ac/nir: ignore 8/16-bit global access offset
ac/nir: fix 32-bit offset global access optimization
aco: flush denormals for 16-bit fmin/fmax on GFX8
aco: implement 16-bit fsign on GFX8
aco: implement 16-bit derivatives
aco: implement 16-bit fsat on GFX8
aco: simplify v_mul_* labelling slightly
aco: insert p_end_wqm before p_jump_to_epilog
nir/loop_analyze: skip if basis/limit/comparison is vector
nir/loop_analyze: scalarize try_eval_const_alu
nir/loop_analyze: fix vector basis/limit/comparison
nir/loop_analyze: check min compatibility with comparison
nir/loop_analyze: support umin and {u,i,f}max
nir/loop_analyze: support loops with min/max and non-add incrementation
vulkan/wsi: don’t support present with queues where blit is unsupported
vulkan/wsi: fix win32 compilation
vulkan/wsi: always create command buffer for special blit queues
nir/loop_analyze: remove invariance analysis
aco/tests: use more raw strings
aco: correctly set min/max_subgroup_size for wave32-as-wave64
radv: use CS wave selection for task shaders
radv: remove radv_shader_info’s cs.subgroup_size
nir: add msad_4x8
nir/algebraic: optimize vkd3d-proton’s MSAD
aco: implement msad_4x8
ac/llvm: implement msad_4x8
radv: enable msad_4x8
nir: remove sad_u8x4
radv: do nir_shader_gather_info after radv_nir_lower_rt_abi
nir/lower_non_uniform: set non_uniform=false when lowering is not needed
nir/lower_shader_calls: remove CF before nir_opt_if
aco: fix labelling of s_not with constant
Rob Clark (34):
ci: Only strip debug symbols
tu/msm: Fix timeline semaphore support
tu/virtio: Fix timeline semaphore support
freedreno/drm: Fix race in zombie import
freedreno: Fix modifier determination
freedreno: Handle DRM_FORMAT_MOD_QCOM_TILED3 import
virtio/drm: Split out common virtgpu drm structs
freedreno/drm: Simplify backend mmap impl
virtio: Add vdrm native-context helper
freedreno/drm/virtio: Switch to vdrm helper
tu/drm/virtio: Switch to vdrm helper
freedreno/a6xx: Assume MOD_INVALID imports are linear
freedreno/a6xx: Fix antichamber trace replay assert
Revert “ci/freedreno: disable antichambers trace”
freedreno/a6xx: Don’t set patch_vertices if no tess
freedreno/a6xx: Rework wave input size
freedreno/drm: Fix mmap leak
freedreno: Always attach bo to submit
isaspec: Sort labels with same output
freedreno/drm: Fix zombie BO import harder
freedreno/a6xx: Fix NV12+UBWC import
freedreno: De-duplicate 19.2MHz RBBM tick conversion
freedreno: Fix timestamp conversion
freedreno: Implement PIPE_CAP_TIMER_RESOLUTION
drm-uapi: Sync drm-uapi
freedreno/layout: Add layout metadata
tu: Add metadata support for dedicated allocations
freedreno/drm: Add BO metadata support
freedreno: Add layout metadata support
ci: More context for color_clear skips for Wayland
ci: List specific color_clears skips
ci: Add wayland-dEQP-EGL.functional.render.* skips
ci: Remove per-driver wayland-dEQP-EGL xfails
freedreno/drm/virtio: Fix typo
Robert Foss (3):
egl/surfaceless: Fix EGL_DEVICE_EXT implementation
egl: Add _eglHasAttrib() function
egl/surfaceless: Don’t overwrite disp->Device if using EGL_DEVICE_EXT
Robert Mader (4):
util: Add new helpers for pipe resources
panfrost: Support parameter queries for main planes
vc4/resource: Support offset query for multi-planar planes
v3d/resource: Support offset query for multi-planar planes
Rohan Garg (31):
intel/compiler: migrate WA 14013672992 to use WA framework
blorp,anv,iris: refactor blorp functions into something more generic
iris: Wa 16014538804 for DG2, MTL A0
iris: pull WA 22014412737 into emit_3dprimitive_was
anv: WA 16014538804 for DG2, MTL A0
blorp: WA 16014538804 for DG2, MTL A0
anv: Refactor loading indirect parameters and filling IDD
anv: refactor kernel dispatch to use new common functions
intel/dev: Add a bit for when the HW can do an indirect draw/dispatch unroll
genxml/12.5: Add the EXECUTE_INDIRECT_DRAW instruction
genxml/12.5: Add the EXECUTE_INDIRECT_DISPATCH instruction
anv: Emit EXECUTE_INDIRECT_DRAW when available
anv: Emit an EXECUTE_INDIRECT_DISPATCH when available
iris: Emit an EXECUTE_INDIRECT_DISPATCH when available
anv: memcpy the thread dimensions only when they’re on the CPU
anv: introduce ANV_TIMESTAMP_REWRITE_INDIRECT_DISPATCH
intel/genxml: Add the preferred slm size enum for xe2
intel: Set a preferred SLM size for LNL
intel/genxml: Update COMPUTE_WALKER_BODY for xe2
intel/genxml: Update IDD for new fields
blorp: set min/max viewport depths to -FLT_MAX/FLT_MAX when EXT_depth_range_unrestricted is enabled
anv: ensure that we clamp only when EXT_depth_range_unrestricted is not enabled
anv: enable VK_EXT_depth_range_unrestricted
iris: Emit EXECUTE_INDIRECT_DRAW when available
intel/compiler: use the proper enum type to store the op
intel/compiler: infer the number of operands using lsc_op_num_data_values
anv: rename anv_create_companion_rcs_command_buffer to anv_cmd_buffer_ensure_rcs_companion
iris,isl: Adjust driver for several commands of clear color (xe2)
intel/fs/xe2+: Lift CPS dispatch width restrictions on Xe2+.
intel/compiler: Update disassembly for new LSC cache enums
anv: untyped data port flush required when a pipeline sets the VK_ACCESS_2_SHADER_STORAGE_READ_BIT
Roland Scheidegger (1):
lavapipe: bump image alignment up to 64 bytes
Roman Stratiienko (5):
v3d: Don’t implicitly clear the content of the imported buffer
u_gralloc: Extract common code from fallback gralloc
u_gralloc: Add QCOM gralloc support
egl/android: Switch to generic buffer-info code
u_gralloc: Add support for gbm_gralloc
Ruijing Dong (12):
radeonsi/vcn: vcn4 encoding interface dummy update
radeonsi/vcn: preparation for enc intra-refresh
radeonsi/vcn: change intra-ref name
radeonsi/vcn: enable intra-refresh in vcn encoders
frontends/va: add intra-refresh in VAAPI interface
radeonsi/vcn: add qp_map definition
frontends/va: add ROI feature
radeonsi/vcn: ROI feature implementation
radeonsi/vcn: enable ROI feature in vcn.
radeonsi/vcn: ROI capability value initialization.
frontends/va: remove some TODOs in hevc encoding
radeonsi/vcn: update session_info from vcn3 and up.
Ryan Neph (6):
virgl: implement resource_get_param() for modifier query
venus: add VN_PERF=no_tiled_wsi_image
venus: strip ALIAS_BIT for WSI image creation on ANV
venus: reject multi-plane modifiers for tiled wsi images
venus: add dri option to enable multi-plane wsi modifiers
venus: fix shmem leak on vn_ring_destroy
Sagar Ghuge (24):
iris: Disable auxiliary buffer if MSRT is bound as texture
iris: Disable CCS compression on top of MSAA compression on ACM
isl: Enable MCS compression on ACM platform
anv: Write timestamp using MI_FLUSH_DW on blitter
anv: Avoid emitting PIPE_CONTROL command for copy/video queue
anv: Flush data cache while clearing depth using HIZ_CCS_WT
anv: Add comment to copy image code block
iris: Init aux map state for compute engine
anv,hasvk: Use uint32_t for queue family indices
blorp: Handle stencil buffer compression on blitter engine
anv: Use RCS cmd buffer if blit src/dest has 3 components
intel/compiler: Adjust assertion in lower_get_buffer_size() for Xe2
intel/fs: Adjust destination size for image size intrinsic
intel/fs: Adjust destination size for global load constant on Xe2+
intel/fs: Adjust destination size for load ubo on Xe2+
intel/genxml: Add BCS/VD0 aux table base address register
anv: Handle video/copy engine queue initialization
anv: Invalidate aux map for copy/video engine
iris: Handle aux map init for copy engine
docs: Document INTEL_COPY_CLASS
anv: Enable blitter engine unconditionally on ACM+
iris: No need to emit PIPELINE_SELECT on Xe2+
anv: No need to emit PIPELINE_SELECT on Xe2+
intel/fs: Check fs_visitor instance before using it
Samuel Pitoiset (169):
radv: move RADV_DEBUG_NO_HIZ check in radv_use_htile_for_image()
radv: implement VK_EXT_image_compression_control
radv: advertise VK_EXT_image_compression_control
ac/gpu_info: remove bogus assertion about number of COMPUTE/SDMA queues
radv: dump the pipeline hash to the gpu hang report
radv: fix a synchronization issue with primitives generated query on RDNA1-2
ac/registers: allow to parse GCVM_L2_PROTECTION_FAULT_STATUS
ac/debug: add a helper to print GPUVM fault protection status
radv: use the GPUVM fault protection status helper
radv: remove NGG streamout support for RDNA1-2
radv: remove unnecessary VS_PARTIAL_FLUSH for NGG streamout
ac/nir: remove dead code in nir_intrinsic_xfb_counter_{add,sub}_amd
aco: remove dead code in nir_intrinsic_xfb_counter_{add,sub}_amd
radv/ci: update list of expected failures/flakes for NAVI31
radv: add RADV_DEBUG=nomeshshader
radv/ci: enable RADV_DEBUG=nomeshshader for vkcts-navi31-valve
radv: bind the non-dynamic graphics state from the pipeline unconditionally
radv: adjust binning settings to improve performance on GFX9
radv: fix compute shader invocations query on compute queue on GFX6
radv: emit COMPUTE_PIPELINESTAT_ENABLE for CS invocations on ACE
ci: backport two mesh/task query fixes for VKCTS
radv/ci: document one more flake test
nir: fix inserting the break instruction for partial loop unrolling
radv: add initial VK_EXT_device_fault support
radv: advertise VK_EXT_device_fault
ci: re-apply two mesh/task query fixes for VKCTS
radv: add a helper to determine if it’s possible to preprocess DGC
radv: emit individual SET_SH_REG for inlined push constants with DGC
radv: optimize emitting inlined push constants with DGC
radv: enable DGC preprocessing when all push constants are inlined
radv: restore sampling CPU/GPU clocks before starting SQTT trace
ac/rgp: update dumping queue event records to the capture
radv: add radv_write_timestamp() helper
radv: add support for RGP queue events
radv: add drirc options to force re-compilation of shaders when needed
radv: fix VRS subpass attachment when HTILE can’t be enabled on GFX10.3
radv: fix registering queues for RGP with compute only
radv: set radv_zero_vram=true for Unreal Engine 4/5
radv: fix a descriptor leak with debug names and host base descriptor set
radv: add a missing async compute workaround for Tonga/Iceland
zink/ci: add a manual job on radv-navi31
aco: remove useless nir_intrinsic_load_force_vrs_rates_amd
radv: remove redundant check when forcing VRS rates
radv: check earlier if a graphics pipeline can force VRS per vertex
ac/surface: change tile mode for 3D PRT surfaces with bpp < 64 on GFX6-8
radv: re-enable sparseResidencyImage3D on POLARIS10+
aco: rename color_exports to exports in create_fs_jump_to_epilog()
radv: rename ps_epilog_inputs to colors for PS epilogs
radv: add radv_physical_device::emulate_mesh_shader_queries for GFX10.3
radv: add support for mesh primitives queries on GFX10.3
radv: define new pipeline statistics indices for mesh/task on GFX11
radv: bump the pipeline state query size to 14 on GFX10.3
radv: do not hardcode the pipeline stats mask for query resolves
radv: add support for mesh shader invocations queries on GFX10.3
radv: rework gfx10_copy_gds_query() slightly
radv: make some gang functions non-static
radv: add support for task shader invocations queries on GFX10.3
radv: enable meshShaderQueries on GFX10.3
radv/ci: add missing expected failures for mesh queries on VANGOGH
radv: disable TC-compatible HTILE on Tonga and Iceland
radv: add missing FDCC_CONTROL bits for GFX1103 R2
radv: set radv_invariant_geom=true for War Thunder
radv: do not set OREO_MODE to fix rare corruption on GFX11
ci: uprev vkd3d-proton to 2.11
radv/ci: add new flakes for VEGA10
radv: remove useless NIR instructions when emitting IBO with DGC
radv: set the stream VA for DGC graphics
radv: use an indirect draw when IBO isn’t updated as part of DGC
radv: enable DGC preprocessing for IBO
radv: fix bogus interaction between DGC and RT with descriptor bindings
radv: make sure to prefetch the compute shader for DGC
radv: remove radv_pipeline_key::dynamic_color_write_mask
radv: simplify creating image views for src resolve images
radv: stop performing redundant resolves with the HW resolve path
radv: remove unused layers support for the HW/FS resolve paths
radv: only re-initialize DCC for one level for the HW resolve path
radv: adjust assertions for multi-layer resolves with the HW/FS paths
radv: remove never used binds_state for DGC
radv: only initialize the VBO reg if VBOs are bound with DGC
radv: only initialize the VTX base SGPR if non-zero with DGC
radv: add DGC support for mesh shader only
radv: advertise VK_EXT_depth_clamp_zero_one
radv: update the reset stipple pattern mode
radv: change the reset stipple pattern mode for adjacent lines
radv: make sure to reset the stipple line state when it’s disabled
radv: set combinedImageSamplerDescriptorCount to 1 for multi-planar formats
radv: switch to on-demand PS epilogs for GPL
radv: remove unused code for compiling PS epilogs as part of pipelines
aco: export depth/stencil/samplemask in create_fs_jump_to_epilog()
ac/nir: add an option to skip MRTZ exports in ac_nir_lower_ps()
radv: determine if MRTZ needs to be exported via PS epilogs
radv: prepare the PS epilog key for exporting MRTZ on RDNA3
radv,aco: declare PS epilog VGPR arguments for depth/stencil/samplemask
radv: determine and emit SPI_SHADER_Z_FORMAT for PS epilogs
zink/ci: remove skipped tests from the list of expected failures for NAVI31
radv: export MRTZ via PS epilogs when alpha to coverage is dynamic on GFX11
radv: enable extendedDynamicState3AlphaToCoverageEnable on GFX11
zink/ci: skip more tests that run OOM on NAVI31
zink/ci: update list of failures for NAVI31
zink/ci: stop running zink-radv-navi31-valve sequentially
ci: uprev vkd3d-proton to a0ccc383937903f4ca0997ce53e41ccce7f2f2ec
radv: simplify disabling MRT compaction for PS epilogs
vulkan: bump headers/registry to 1.3.273
radv: promote EXT_calibrated_timestamps to KHR
docs: update features.txt for RADV
radv: remove useless check for TC-compat CMASK images during fb emission
radv: stop clearing FMASK_COMPRESS_1FRAG_ONLY for TC-compat CMASK images
vulkan/runtime: promote VK_EXT_vertex_attribute_divisor to KHR
radv: advertise VK_KHR_vertex_attribute_divisor
radv/ci: remove dEQP-VK.mesh_shader.ext.query.* from the lists
radv: emit the task shader in radv_emit_graphics_pipeline()
radv: cleanup ac_nir_lower_ps options
radv: cleanup gathering PS info with/without PS epilogs
radv: cleanup radv_pipeline_generate_ps_epilog_key()
radv: add support for MRT compaction with PS epilogs
radv: fix binding partial depth/stencil views with dynamic rendering
radv: stop asserting some image create info fields
radv: remove some declared but unused functions/macros
radv: add missing HTILE support for fb mip tail workaround
radv: stop checking FMASK for the fb mip tail workaround
radv: move emitting the fb mip tail workaround when rendering begins
radv: remove radv_get_tess_output_topology() declaration
radv: move meta declarations to radv_meta.h
radv: move RADV_HASH_SHADER_xxx flags to radv_pipeline.c
radv: move radv_image_is_renderable() to radv_image.c
radv: move more descriptor related declarations to radv_descriptor_set.h
radv: move radv_depth_clamp_mode to radv_cmd_buffer.c
radv: move more shader related declarations to radv_shader.h
radv: move SI_GS_PER_ES to radv_constants.h
radv: move buffer view related code to radv_buffer_view.c
radv: move image view related code to radv_image_view.c
vulkan: bump headers/registry to 1.3.274
vulkan: drop VK_ENABLE_BETA_EXTENSIONS for video encode layouts
radv/ci: update CI lists for NAVI10,NAVI31 and RENOIR
ci: apply two bugfixes for VKCTS
radv: move radv_{emulate,enable}_rt() to radv_physical_device.c
radv: make a couple of NIR RT functions as static
radv: move radv_rt_{common,shader} files to nir/
radv: move radv_BindImageMemory2() to radv_image.c
radv: add support for VkBindMemoryStatusKHR
radv: rename RADV_GRAPHICS_STAGES to RADV_GRAPHICS_STAGE_BITS
radv: add support for version 2 of all descriptor binding commands
radv: add support for NULL index buffer
radv: advertise VK_KHR_maintenance6
radv: disable FMASK for MSAA images with layers on GFX9
radv: stop clearing CMASK to 0xcc when FMASK is present on GFX9
radv: disable stencil test without a stencil attachment
radv: constify a variable in radv_emit_depth_control()
radv: remove duplicated si_tile_mode_index() function
radv: rename si_make_texture_descriptor() to gfx6_make_texture_descriptor()
radv: remove radv_write_scissors()
radv: drop si_ prefix from all functions
Revert “radv: disable DCC with signedness reinterpretation on GFX11”
radv: stop disabling DCC for mutable with 0 formats on GFX11
radv: do not program COMPUTE_MAX_WAVE_ID (GDS register) on GFX6
radv/winsys: replace ‘<= GFX6’ by ‘== GFX6’
radv: query drirc options in only one place
radv: move dri options to radv_instance::drirc
radv: rework declaring color arguments for PS epilogs
Revert “radv/rt: Lower ray payloads to registers”
radv: do not issue SQTT marker with DISPATCH_MESH_INDIRECT_MULTI
radv: add missing disable_shrink_image_store to the pipeline key
radv: move RADV_HASH_SHADER_KEEP_STATISTICS to radv_pipeline_key
radv: initialize radv_device::disable_trunc_coord earlier
radv: introduce radv_device_cache_key for per-device cache compiler options
radv: move all per-device keys from radv_pipeline_key to radv_device_cache_key
radv: fix indirect dispatches on the compute queue on GFX7
radv: fix indirect draws with NULL index buffer on GFX10
radv: fix segfault when getting device vm fault info
Sarah Walker (3):
pvr: Update AM62 DSS compatible string to match upstream
pvr: csbgen: Add dummy implementation of stream type
pvr: Add command stream and static context state layout to rogue_kmd_stream.xml
Sathishkumar S (1):
frontends/va: use va interface for jpeg partial decode
Sebastian Wick (1):
radeonsi: Destroy queues before the aux contexts
Sergi Blanch Torne (8):
ci: disable Collabora’s LAVA lab for maintenance
Revert “ci: disable Collabora’s LAVA lab for maintenance”
ci: disable Collabora’s LAVA lab for maintenance
Revert “ci: disable Collabora’s LAVA lab for maintenance”
Revert “ci: disable collabora farm as it is currently offline”
ci: disable Collabora’s LAVA lab for maintenance
Revert “ac/nir: Export clip distances according to clip_cull_mask”
Revert “ci: disable Collabora’s LAVA lab for maintenance”
Shuicheng Lin (1):
intel/xe: Correct DRM_XE_EXEC_QUEUE_SET_PROPERTY’s ioctl
Sil Vilerino (76):
d3d12: d3d12_video_buffer_create_impl - Fix resource importing
d3d12: Allow creating d3d12_dxcore_screen from existing ID3D12Device
vl/win32: Add vl_win32_screen_create_from_d3d12_device
gallium/auxiliary: Fix pb_bufmgr_slab.c leak
pipe: Extend get_feedback with additional metadata
pipe: Add PIPE_VIDEO_CAP_ENC_H264_DISABLE_DBK_FILTER_MODES_SUPPORTED
pipe: Add PIPE_VIDEO_CAP_ENC_INTRA_REFRESH_MAX_DURATION
pipe: Add H264 VUI encode params
pipe: Add HEVC VUI encode params
pipe: Add max_slice_bytes for H264, HEVC encoding
frontend/va: Add log2_max_frame_num_minus4 and log2_max_pic_order_cnt_lsb_minus4 for h264enc
frontend/va: Parse VUI H264 parameters
frontend/va: Parse VUI HEVC parameters
frontend/va: Support VAEncMiscParameterMaxSliceSize
meson: add vp9 and av1 codec support options
gallium/vl: Check for VP9 and AV1 meson option support flags
d3d12: Plumb pipe_h264_enc_picture_desc.dbk.disable_deblocking_filter_idc
d3d12: Use log2_max_frame_num_minus4 and log2_max_pic_order_cnt_lsb_minus4 from pipe_pic_params_h264
d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported
d3d12: Disable codecs according to meson video-codecs option
d3d12: Implement H264 VUI Writer
d3d12: Implement HEVC VUI Writer
d3d12: Implement Intra Refresh for H264, HEVC, AV1
d3d12: Support PIPE_VIDEO_CAP_ENC_H264_DISABLE_DBK_FILTER_MODES_SUPPORTED
d3d12: Implement get_feedback with additional metadata
d3d12: fix usage of GetAdapterLuid() in mingw/GCC using ABI helper
ci: Build d3d12 gallium driver in debian-x86_32
pipe: Support inserting new headers on each H264/HEVC IDR frame
pipe: Add get_feedback_fence for encode async waiting on pipe_feedback_fence
pipe: Add fence_get_win32_handle to get HANDLE from pipe_fence_handle
pipe: Add p_video_codec.get_encode_headers for out of band VPS, SPS, PPS
pipe: Add PIPE_VIDEO_FEEDBACK_METADATA_TYPE_AVERAGE_FRAME_QP
pipe: Add PIPE_VIDEO_CAP_ENC_H264_SUPPORTS_CABAC_ENCODE
pipe: Add PIPE_H264_MAX_REFERENCES
frontend/va: Add h264 encode ip_period param
frontend/va: Add VACodedBufferSegment Average QP metadata
frontend/va: Use p_video_codec.get_feedback_fence to report errors on frame submission
vl_winsys_win32: call winsys->destroy(winsys) in error conditions
d3d12: Implement inserting optional new headers on each H264/HEVC IDR frame
d3d12: Do not increase active_seq_parameter_set_id on new SPS. Force PPS on new SPS
d3d12: H264 encode - Allow CONSTRAINED_BASELINE profile to be written in headers
d3d12: Implement get_feedback_fence for encode async waiting on pipe_feedback_fence
d3d12: Implement fence_get_win32_handle to get HANDLE from d3d12_fence
d3d12: Only pass texture dimensions to d3d12_video_encoder_update_current_encoder_config_state
d3d12: Implement d3d12_video_encoder_get_encode_headers for out of band VPS, SPS, PPS
d3d12: Use new pipe h264 encode ip_period param
d3d12: max_frame_poc workaround for infinite GOPs
d3d12: Fix max slice size and max frame size metadata reporting
d3d12: Implement PIPE_VIDEO_FEEDBACK_METADATA_TYPE_AVERAGE_FRAME_QP
d3d12: Autodetect d3d12_video_buffer imported handle/resource format and dimensions when not passed
d3d12: Implement PIPE_VIDEO_CAP_ENC_H264_SUPPORTS_CABAC_ENCODE
d3d12: Detect imported resource buffer unknown format
d3d12: Improve error detection and reporting for video encoder
d3d12: Fix d3d12_tcs_variant_cache_destroy leak in d3d12_context
d3d12: Fix screen->winsys leak in d3d12_screen
d3d12: d3d12_create_fence_win32 - Fix double refcount bump
d3d12: Fix max reference frames reporting when HW does not support B frame
d3d12: Video Encoder - When setting rate control dirty flags take into account rolled back optional configs
d3d12: Video Encoder: Support reporting non contiguous NALU, offsets for frontend extraction
meson: Add all, all_free (default) options for video-codecs option.
d3d12: Fix usage of H264/HEVC specific classes when VIDEO_CODEC_H26XENC not set
d3d12: Fix AV1 video encode 32 bits build
d3d12: Fix typos in d3d12_video_encoder_bitstream_builder_h264
d3d12: Use enc_constraint_set_flags for H264 NALU writing
frontends/va: Parse enc_constraint_set_flags from packed SPS
d3d12: Check video encode codec cap before checking encode profile/level cap
meson: Only build WGL for Windows platform when opengl option is active
d3d12: Bump directx-headers dependency to v611.0 for latest video codecs and features
d3d12: Remove D3D12_SDK_VERSION checks after bumping directx-headers dependency to v611
d3d12: Fix warning C4065 switch statement contains default but no case labels
d3d12: Implement Delta QP ROI In h264, hevc and av1 video encode
d3d12: Report support for PIPE_VIDEO_CAP_ENC_ROI for Delta QP
Revert “d3d12: Only destroy the winsys during screen destruction, not reset”
Revert “d3d12: Fix screen->winsys leak in d3d12_screen”
d3d12: Fix AV1 Encode - log2 rounding for tile_info section
d3d12: Implement cap for PIPE_VIDEO_CAP_ENC_INTRA_REFRESH
Simon Ser (3):
egl: extract EGLDevice setup in dedicated function
egl: move dri2_setup_device() after dri2_setup_extensions()
egl: ensure a render node is passed to _eglFindDevice()
Simon Zeni (2):
EGL: sync files with Khronos
egl: implement EGL_EXT_query_reset_notification_strategy
Sviatoslav Peleshko (23):
nir/loop_analyze: Fix inverted condition handling in iterations calculation
anv: Fix MI_ARB_CHECK calls in generated indirect draws optimization
nir/loop_analyze: Don’t test non-positive iterations count
intel/fs: Don’t optimize DW*1 MUL if it stores value to the accumulator
intel/compiler: Add variable to dump binaries of all compiled shaders
intel/disasm: Print half-float values instead of placeholder
intel/compiler: Set flag reg to 0 when disabling predication
intel/disasm: Print src1_len correctly depending on ExDesc type
intel/fs: Set group 0 for Wa_14010017096 MOV instruction
intel/eu/validate: Validate that the ExecSize is a factor of chosen ChanOff
intel/tools/i965_asm: Add SWSB handling
intel/tools/i965_asm: Handle HF immediates
intel/tools/i965_asm: Handle sync instruction
intel/tools/i965_asm: Allow neg and abs modifiers on accumulator register
intel/tools/i965_asm: Don’t override flag reg from cond modifier
intel/tools/i965_asm: Allow src0 and src2 of ternary instructions to be imm
intel/tools/i965_asm: Implement gfx12 and gfx12.5 send/sendc
intel/tools/i965_asm: Add dp4a and add3 instructions
intel/tools/i965_asm: Don’t set src0 for break and while on gfx12
intel/tools/tests: Fix sends indirect argument in gfx9 test
intel/tools/tests: Unbreak i965_asm tests
intel/tools/tests: Add i965_asm tests for gfx12 and gfx12.5
nir: Use alu source components count in nir_alu_srcs_negative_equal
Sylvain Munaut (1):
mesa/st, dri2, wgl, glx: Restore flush_objects interop backward compat
Tapani Pälli (34):
intel/dev: provide intel_device_info_is_adln helper
iris: add required PC for Wa_14014966230
anv: add current_pipeline for batch_emit_pipe_control
anv: add required PC for Wa_14014966230
intel/dev: fix intel_device_info_is_adln check
iris: handle tile case where cso width, height is zero
anv: skip engine initialization if vm control not supported
iris: add data cache flush for pre hiz op
anv/drirc: add option to disable FCV optimization
drirc: use fake_sparse for Armored Core 6
drirc: Set limit_trig_input_range option for Valheim
iris: implement Wa_18020335297
anv: refactor state emission
anv: implement Wa_18020335297
iris: implement dummy blit for Wa_16018063123
anv: implement dummy blit for Wa_16018063123
mesa: lower EXT_render_snorm version requirement
anv: use slow clear for small surfaces with Wa_18020603990
iris: use slow clear for small surfaces with Wa_18020603990
anv/hasvk/drirc: change anv_assume_full_subgroups to have subgroup size
drirc: setup anv_assume_full_subgroups=16 for UnrealEngine5.1
anv: cleanup, use intel_needs_workaround instead of is_dg2
iris: cleanup, use intel_needs_workaround instead of is_dg2
iris: use intel_needs_workaround with 14015055625
mesa: fix enum support for EXT_clip_cull_distance
drirc/anv: disable FCV optimization for Baldur’s Gate 3
isl: implement Wa_14018471104
iris: use workaround framework for Wa_22018402687
anv: use workaround framework for Wa_22018402687
anv: check for wa 16013994831 in emit_so_memcpy_end
iris: expand pre-hiz data cache flush to gfx >= 125
anv: expand pre-hiz data cache flush to gfx >= 125
iris: replace constant cache invalidate with hdc flush
anv: move *bits_for_access_flags to genX_cmd_buffer
Tatsuyuki Ishi (25):
fast_urem_by_const: #ifdef DEBUG an assertion.
radv: Fix mis-sizing of pipeline_flags in radv_hash_rt_shaders.
radv: Use sizeof(flags) instead of hardcoded size in radv_hash_shaders.
aco: Replace aco_vs_input_state.divisors with bitfields.
radv: Remove last VS prolog reuse logic.
radv, aco: Rework VS prolog key handling.
radv, aco: Inline struct aco_vs_input_state.
radv: Pre-mask misaligned_mask for VS prolog.
radv: Implement helpers for shader part caching.
radv: Use shader part caching helpers for VS prolog and PS/TCS epilog.
zink: Fix missing sparse buffer bind synchronization.
zink: Defer freeing sparse backing buffers.
zink: Fix waiting for texture commit semaphores.
zink: Remove now unused dead_framebuffers.
radv: Remove aspect mask “expansion” for copy_image.
radv: Add workaround to allow sparse binding on gfx queues.
radv: Enable radv_legacy_sparse_binding for DOOM Eternal.
radv/amdgpu: Remove virtual bo dump logic.
radv/amdgpu: Separate the concept of residency from use_global_list.
radv: Simplify shader config assignment.
radv: Move up radv_get_max_waves, radv_get_max_scratch_waves.
radv: Precompute shader max_waves.
radv: Add layer to skip UnmapMemory for Quantic Dream Engine
radv: Recompute max_waves after postprocessing RT config
radv: never set DISABLE_WR_CONFIRM for CP DMA clears and copies
Tele42 (1):
drirc: enable `vk_wsi_force_swapchain_to_current_extent` for “The Talos Principle VR”
Teng, Jin Chung (1):
d3d12: Decode - Adding more supported resolution
Thomas Devoogdt (1):
util: os_same_file_description: fix unknown linux < 3.5 syscall SYS_kcmp
Thomas H.P. Andersen (13):
docs: update nvk extensions
nvk: use nvk_pipeline_zalloc
nouveau: drop unused #includes of tgsi_parse.h
nvk: VK_EXT_color_write_enable
docs: update features.txt for nvk
nvk: loop over stages in MESA order
nvk: add hashing for shaders
nvk: allocatable nvk_shaders
nvk: pipeline shader cache
nvk: VK_EXT_pipeline_creation_feedback
nvk: VK_EXT_pipeline_creation_cache_control
nvk: VK_EXT_shader_module_identifier
docs: update features.txt for nvk
Thong Thai (1):
radeonsi/vcn: remove EFC support for renoir
Timothy Arceri (24):
nir: move build_write_masked_stores() to nir builder
glsl/nir: implement a nir based lower distance pass
glsl: switch to NIR distance lowering pass
glsl: remove now unused lower distance pass
nir: simplify nir_build_write_masked_store()
glsl: drop ir_binop_ubo_load
glsl: add nir based lower_named_interface_blocks()
glsl: use the nir based lower_named_interface_blocks()
glsl: remove GLSL IR lower_named_interface_blocks()
nir: add nir_fixup_deref_types()
glsl: support glsl linking in nir block linker
glsl: use new nir based block linker
glsl: remove now unused GLSL IR block linker
glsl/st: move has_half_float_packing flag to consts struct
glsl/st: move remaining glsl ir lowering to linker
mesa/st: drop additional validate_ir_tree() call
glsl: combine shader stage loops in linker
radeonsi: fix divide by zero in si_get_small_prim_cull_info()
glsl: tidy up validation loop in linker
glsl: remove some unused linker code
glsl: copy precision val of function output params
glsl: add additional lower mediump test
glsl: move glsl ir lowering out of glsl_to_nir()
glsl: add support for inout params to glsl_to_nir()
Timur Kristóf (32):
radv: Remove always false tmz variables from SDMA functions.
radv: Expose radv_get_dcc_max_uncompressed_block_size function.
radv: Implement buffer/image copies on transfer queues.
radv: Add temporary BO for transfer queues.
radv: Implement workaround for unaligned buffer/image copies.
ac: Rename SDMA max copy size macros to reflect SDMA version.
ac: Remove CIK prefix from SDMA opcodes.
ac: Add sdma_version enum and use it for SDMA features.
radv: Use GPU info for determining SDMA metadata support.
radv: Use SDMA version instead of gfx_level where possible.
radv: disable HTILE/DCC for concurrent images with transfer queue if unsupported.
radv: Disable DCC on exclusive images with transfer queue when SDMA doesn’t support it.
radv: Disable HTILE on exclusive images with transfer queues when SDMA doesn’t support it.
radv: Don’t retile DCC on transfer queues.
radv: Implement barriers for transfer queues.
radv: Implement vkCmdFillBuffer on transfer queues.
radv: Implement vkCmdWriteTimestamp2 on transfer queues.
radv: Implement vkCmdWriteBufferMarker2AMD on transfer queues.
radv: Implement buffer copies on transfer queues.
radv: Implement vkCmdUpdateBuffer on transfer queues.
radv: Move SDMA function and struct declarations to a new header.
radv: Unify SDMA surface struct for linear and tiled images.
radv: Refactor and simplify SDMA surface info functions.
radv: Pass radv_sdma_surf from copy functions to SDMA.
radv: Use SDMA surface structs for determining unaligned buffer copies.
radv: Clean up SDMA chunked copy info struct.
radv: Use correct plane and binding index with SDMA.
radv: Correct binding index for transfer buffer-image copies.
radv: Implement image copies on transfer queues.
radv: Implement T2T scanline copy workaround.
radv: Expose transfer queues, hidden behind a perftest flag.
radv: Correctly select SDMA support for PRIME blit.
Vignesh Raman (5):
ci: Add CustomLogger class and CLI tool
ci: copy logging script to install
ci: bare-metal: poe: Create structured logs
ci: bare-metal: cros-servo: Create structured logs for a630
ci/freedreno: add FARM variable
Vinson Lee (6):
ac/surface/tests: Remove duplicate variable block_size_bits
nir: Fix decomposed_prmcnt copy-paste error
nvk: Fix tautological-overlap-compare warning
etnaviv: Remove duplicate initializers
ac/rgp: Fix single-bit-bitfield-constant-conversion warning
intel/disasm: Remove duplicate variable reg_file
Violet Purcell (1):
gallium: Fix undefined symbols in version scripts
Vitaliy Triang3l Kuzmin (13):
r600: Move r600_create_vertex_fetch_shader to r600_shader.c
r600: Remove Gallium dependencies in r600_isa
r600: Replace R600_ERR with R600_ASM_ERR in shader code
r600: Remove Gallium dependencies in r600_asm
r600: Split r600_shader.h into common and Gallium parts
r600/sfn: Make r600 header include paths relative
r600/sfn: Split r600_shader_from_nir into common and Gallium parts
r600: Fix outputs typo in print_pipe_info
r600: Replace TGSI I/O semantics with shader_enums
r600/sfn: Change sampler_index to texture_index in buffer txs
r600/sfn: Remove unused sampler reference in emit_tex_lod
nir: Don’t skip lower_alu if only bit_count needs lowering
vulkan: Fix pipeline layout allocation scope
Vlad Schiller (1):
pvr: Fix VK_EXT_texel_buffer_alignment
VladimirTechMan (1):
venus/android: Switch to using u_gralloc
Yiwei Zhang (57):
venus: use common vk_image_format_to_ahb_format helper
venus: use common vk_image_usage_to_ahb_usage helper
venus: tiny refactor of device memory report interface
venus: avoid modifier prop query in vn_android_get_image_builder
venus: use common vk_image as vn_image base
venus: use common vk_device_memory as vn_device_memory base
venus: use common AHB management and export impl
venus: use vk_device_memory tracked export and import handle types
venus: use vk_device_memory tracked size
venus: use vk_device_memory tracked memory_type_index
venus: fix query feedback batch leak and race upon submission
zink: apply can_do_invalid_linear_modifier to Venus
venus: scrub msaa sample mask only with valid msaa state
venus: fix async compute pipeline creation
venus: properly initialize ring monitor initial alive status
venus: add missing shmem pool fini for cs_shmem pool
venus: reduce ring idle timeout from 50ms to 5ms
venus: use STACK_ARRAY to prepare for indirect submission
venus: enable renderer shmem cache dump for cache debug
venus: add ring helper to avoid redundant ring wait requests
venus: use instance allocator for ring allocs
venus: use instance allocator for indirect cs storage alloc
venus: add vn_instance_fini_ring helper
venus: refactor instance creation failure path
venus: move ring monitor to instance for sharing across rings
venus: refactor to add vn_watchdog
venus: further cleanup vn_relax_init to take instance instead of ring
venus: always set reply command stream to avoid seek
venus: make vn_renderer_shmem_pool thread-safe
venus: remove command_dropped tracking
venus: relax ring mutex
venus: move ring shmem into vn_ring
venus: move the rest ring belongings into ring
venus: move ring submission into ring
venus: move the actual ring creation into ring as well
venus: add vn_ring_get_id and hide vn_ring internals entirely
venus: switch to vn_ring as the protocol interface - part 1
venus: switch to vn_ring as the protocol interface - part 2
venus: switch to vn_ring as the protocol interface - part 3
venus: add vn_gettid helper
venus: dispatch background shader tasks to secondary ring
driconfig: add a workaround for Hades (Vulkan backend)
vulkan/wsi/wayland: ensure drm modifiers stored in chain are immutable
venus: clang format fixes
venus: split up the pipeline fix description into self and pnext
venus: refactor to add pipeline info fixes helpers
venus: properly ignore formats in VkPipelineRenderingCreateInfo
meson/vulkan/util: allow venus to drop compiler deps
venus: make tls hint specific to pipeline creation
venus: TLS ring
venus: clean up secondary ring
venus: allow to retrieve pipeline cache on TLS ring
venus: populate oom from ring submit alloc failures
vulkan/wsi/wayland: fix returns and avoid leaks for failed swapchain
venus: fix pipeline layout lifetime
venus: fix pipeline derivatives
venus: fix to respect the final pipeline layout
Yogesh Mohan Marimuthu (10):
winsys/amdgpu: add _dw to max_ib_size variable for code readability
winsys/amdgpu: remove ib_type variable from struct amdgpu_ib
winsys/amdgpu: rename struct amdgpu_ib main variable as main_ib everywhere
winsys/amdgpu: rename ib variable name to chunk_ib
winsys/amdgpu: remove rcs variable from struct amdgpu_ib
winsys/amdgpu: move 125% comment to correct line of code
winsys/amdgpu: rename requested_size_dw to projected_size_dw
winsys/amdgpu: rename ptr_ib_size_inside_ib to is_chained_ib
winsys/amdgpu: rename big_ib_buffer,ib_mapped variables in struct amdgpu_ib
winsys/radeon: remove unused gpu_address variable from struct radeon_cmdbuf
Yonggang Luo (61):
compiler: Implement num_mesh_vertices_per_primitive to match u_vertices_per_prim
treewide: Merge num_mesh_vertices_per_primitive and u_vertices_per_prim into mesa_vertices_per_prim
nir: remove redundant include of gallium headers
nir: #include "util/macros.h" for BITFIELD64_MASK in nir.c
compiler,vulkan,drm-shim: Remove unused include directories from meson.build
nvk: Should use alignment instead of align
microsoft/clc: Using sampler_id instead PIPE_MAX_SHADER_SAMPLER_VIEWS for dxil_lower_sample_to_txf_for_integer_tex
microsoft/clc: Use 128 instead of PIPE_MAX_SHADER_SAMPLER_VIEWS
microsoft: define enum dxil_tex_wrap to avoid the usage of enum pipe_tex_wrap
microsoft: decouple microsoft vulkan driver and compiler from gallium
dzn: Fixes -Werror=incompatible-pointer-type
d3d12,dzn: Simplify the usage of #include <wsl/winadapter.h>
util: Fixes note: the alignment of ‘_Atomic long long int’ fields changed in GCC 11.
glsl: move glsl_get_gl_type into glsl/linker_util.h
meson/win32: There is no need install OpenGL headers on win32
intel: Remove unused ALIGN macro
clover: Rename function align to align_vector to avoid conflict with global align
treewide: Avoid use align as variable, replace it with other names
util,vulkan,mesa,compiler: Generate source files with utf8 encoding from mako template
intel: Generate source file with utf-8 encoding from mako template
zink: Generate source file with utf-8 encoding from mako template
docs: Generate document with utf8 encoding
v3dv: Use correct type VkStencilOp in function translate_stencil_op
broadcom/compiler: Use correct type pipe_logicop for logicop_func in struct v3d_fs_key
broadcom/compiler: remove unused blend in v3d_fs_key
broadcom: remove unused headers include
osmesa: Make osmesa.h compatible with Windows SDK’s GL.h
broadcom/(compiler,common): avoid include of gallium headers in header files
broadcom/compiler: remove include of gallium headers from meson.build
osmesa: Fixes building osmesa.c on windows
meson: Support for both packaging and distutils
dzn: Remove #if D3D12_SDK_VERSION blocks now that 611 is required
ci/msvc: update flex and bison to winflexbison3
ci/msvc: Install graphics tools(DirectX debug layer) easy to stuck, place it at the beginning
ci/msvc: Split install vulkan sdk out of choco
ci/msvc: Rename vs2019 to msvc
ci/msvc: Rename vs to msvc for consistence
ci/msvc: Improve msvc init
ci/msvc: Remove &windows_msvc_image_tag
ci/msvc: Upgrade to vs2022 build tools
ci/msvc: Install msvc2019 only from vs2022
ci/msvc: Install both msvc2019 and msvc2022
ci/msvc: Stick deqp-runner to version v0.16.1
ci/msvc: Stick VK-GL-CTS to specific version 56114106d860c121cd6ff0c3b926ddc50c4c11fd
ci/msvc: Split the install of rust and d3d out of mesa_deps_test.ps1
ci/microsoft: Update the image-tag and image-path for msvc2019/msvc2022
treewide: Replace the include of nir_types.h with glsl_types.h
compiler/glsl: Move glsl specific _mesa_glsl_initialize_types out and glsl_symbol_table of glsl_types.h
intel: Avoid use align as variable, replace it with other names
intel: Use ALIGN_POT instead of ALIGN inside macro define
intel: Cleanup duplicate ALIGN macro defines
intel,crocus,iris: Use align64 instead of ALIGN for 64 bit value parameter
amd: Use align64 instead of ALIGN for 64 bit value parameter
util,compiler: Avoid use align as variable, replace it with other names
panfrost: Avoid use align as variable, replace it with other names
glsl: Fixes glcpp/tests with mingw/gcc
util: Add align_uintptr and use it treewide to replace ALIGN that works on size_t and uintptr_t
nvk: Avoid use align as variable, replace it with alignment
nouveau: Use align64 instead of ALIGN for 64 bit value parameter
etnaviv/drm: Remove redundant ALIGN macro by #include "util/u_math.h"
compiler/spirv: The spirv shader is binary, should write in binary mode
Zhang Ning (2):
iris: use helper util_resource_at_index
lima: Support parameter queries for PIPE_RESOURCE_PARAM_NPLANES
Zhang, Jianxun (5):
intel/genxml: Remove 3DSTATE_CLEAR_PARAMS instruction (xe2)
intel/genxml: update 3DSTATE_WM_HZ_OP instruction (xe2)
intel/genxml: update 3DSTATE_DEPTH_BUFFER instruction (xe2)
intel/isl: update 3DSTATE_STENCIL_BUFFER (xe2)
intel/genxml: Add RENDER_SURFACE_STATE for xe2
antonino (4):
nir: don’t take the derivative of the array index in `nir_lower_tex`
vulkan: use instance allocator for `object_name` in some objects
nir/zink: drop NIH helper in favor of `mesa_vertices_per_prim`
egl: only check dri3 on X11
daoxianggong (1):
zink - Fix for blend color change without blend state change
duncan.hopkins (4):
util: Update util/libdrm.h stubs to allow loader.c to compile on MacOS.
dri: added build dependencies for systems using non-standard prefixed X11 libs.
glx: fix automatic zink fallback loading between hw and sw drivers on MacOS
vulkan: added build dependencies for systems using non-standard prefixed X11 libs.
i509VCB (3):
asahi,docs: add PBE to hardware glossary
asahi: create queue for screen
agx: remove internal agx_device queue
jphuang (1):
dzn: Change dst image layout according to aspect
llyyr (1):
docs: document AMD_DEBUG=noefc and useaco
ratatouillegamer (2):
hasvk: Add Vulkan API version override
hasvk: Enable hasvk override Vulkan API Version for Brawlhalla