Mesa 20.3.0 Release Notes / 2020-12-03

Mesa 20.3.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 20.3.1.

Mesa 20.3.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
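Because the reported version depends on the driver, an application may want to check it at run time. The following is a minimal sketch in Python, assuming the desktop OpenGL layout of the GL_VERSION string (a leading "major.minor" or "major.minor.release" followed by optional vendor text). The helper names parse_gl_version and supports_gl are hypothetical, and the sample strings are illustrative rather than captured from a real driver.

```python
import re

# Desktop GL_VERSION layout: "<major>.<minor>[.<release>] [vendor-specific text]".
# OpenGL ES strings use a different prefix and are not handled by this sketch.
_GL_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:\s+(.*))?$")


def parse_gl_version(version_string):
    """Return (major, minor, vendor_text) for a desktop GL_VERSION string."""
    match = _GL_VERSION_RE.match(version_string.strip())
    if match is None:
        raise ValueError("unrecognised GL_VERSION string: %r" % version_string)
    major = int(match.group(1))
    minor = int(match.group(2))
    vendor_text = match.group(4) or ""
    return major, minor, vendor_text


def supports_gl(version_string, want_major, want_minor):
    """True if the reported version is at least want_major.want_minor."""
    major, minor, _ = parse_gl_version(version_string)
    return (major, minor) >= (want_major, want_minor)
```

In a real program the string would come from glGetString(GL_VERSION) on a current context; the integer queries GL_MAJOR_VERSION and GL_MINOR_VERSION avoid parsing altogether on contexts that support them.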

Mesa 20.3.0 implements the Vulkan 1.2 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
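The apiVersion field is a packed 32-bit integer, so it has to be decoded before it can be compared with "1.2". The sketch below mirrors the bit layout used by the Vulkan headers (major in bits 22 to 28, minor in bits 12 to 21, patch in bits 0 to 11). The Python function names are illustrative only and are not part of any Vulkan binding.

```python
def make_vk_api_version(major, minor, patch):
    """Pack a version the same way the Vulkan headers do (variant bits left at 0)."""
    return (major << 22) | (minor << 12) | patch


def decode_vk_api_version(packed):
    """Split a packed apiVersion value into (major, minor, patch)."""
    major = (packed >> 22) & 0x7F   # 7 bits
    minor = (packed >> 12) & 0x3FF  # 10 bits
    patch = packed & 0xFFF          # 12 bits
    return major, minor, patch


def device_supports(packed, want_major, want_minor):
    """True if the device reports at least want_major.want_minor."""
    major, minor, _ = decode_vk_api_version(packed)
    return (major, minor) >= (want_major, want_minor)
```

A driver that reports Vulkan 1.2.0 would return the value produced by make_vk_api_version(1, 2, 0); a driver limited to 1.0 or 1.1 would fail the device_supports(value, 1, 2) check.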

SHA256 checksum

2999738e888731531cd62b27519fa37566cc0ea2cd7d4d97f46abaa3e949c630  mesa-20.3.0.tar.xz
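One way to check a downloaded tarball against the checksum line above is sketched here in Python using only the standard library. The helper names (sha256_of_file, parse_checksum_line, verify_download) are hypothetical, and the sketch assumes the checksum line keeps the "digest, whitespace, file name" layout shown above.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_line(line):
    """Split a '<hex digest>  <file name>' line into its two parts."""
    expected, filename = line.split(None, 1)
    return expected.lower(), filename.strip()


def verify_download(path, checksum_line):
    """True if the file at `path` matches the digest in `checksum_line`."""
    expected, _ = parse_checksum_line(checksum_line)
    return sha256_of_file(path) == expected
```

For example, verify_download("mesa-20.3.0.tar.xz", line) would be called with the checksum line copied from these notes and the path of the downloaded file; a False result suggests a corrupted or modified download.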

New features

  • GL 4.5 on llvmpipe

  • GL_INTEL_blackhole_render on radeonsi

  • GL_NV_copy_depth_to_color for NIR

  • GL_NV_half_float

  • GL_NV_shader_atomic_int64 on radeonsi

  • EGL_KHR_swap_buffers_with_damage on X11 (DRI3)

  • VK_PRESENT_MODE_FIFO_RELAXED on X11

  • GLX_EXT_swap_control for DRI2 and DRI3

  • GLX_EXT_swap_control_tear for DRI3

  • VK_KHR_copy_commands2 on RADV

  • VK_KHR_shader_terminate_invocation on RADV

  • NGG GS support in ACO

  • VK_KHR_shader_terminate_invocation on ANV

  • driconf: add glx_extension_override

  • driconf: add indirect_gl_extension_override

  • VK_AMD_mixed_attachment_samples on RADV (GFX6-GFX7)

  • GL_MESA_pack_invert on r100 and vieux

  • GL_ANGLE_pack_reverse_row_order

  • VK_EXT_shader_image_atomic_int64 on RADV

Bug fixes

  • [icl,tgl][iris][i965][regression][bisected] piglit failures

  • shader-db valgrind error

  • [AMDGPU NAVI 5700xt] Large parts of the Blender viewport do not render correctly if an object with hair is moved.

  • [aco] problem compiling compute pipeline

  • zink: regression after !7606

  • glcpp test 084-unbalanced-parentheses fails with bison 3.6.y

  • zink+radv: corruption on pre-game menu in quake3

  • panfrost massive glitches apitrace opengl 2.1

  • [radeonsi] After 549ae5f84375dfadb86cfd465f0103acfae3249f commit Firefox Nightly Asan begins crashes

  • Amber test NIR validation failed after spirv_to_nir

  • zink: add detection for wsi_memory_allocate_info usage

  • Follow-up from “nir,spirv: Add generic pointers support”

  • v3d GL_ARB_vertex_array_bgra support

  • iris: glClear with FBO imported from DMA-BUF doesn’t work

  • Fast-clears of GL_ALPHA16 textures are broken on TGL

  • NV50_PROG_USE_NIR=1 doesn’t work for piglit/bin/pbo-teximage ?

  • Follow-up from “st/mesa: Use nir-to-tgsi for builtins if the driver needs TGSI”

  • [spirv-fuzz] Shader causes an assertion failure in nir_opt_large_constants

  • Amber test validate_phi_src

  • Regnum Online UBO break after game update

  • Current mesa git fails to build in multilib environment?

  • radv/aco: Vertex explosion on RPCS3

  • llvmpipe-cl should not run for other drivers

  • Factorio v1.0 - Linux native - 64 bit - OpenGL/radeonsi - completely broken rendering

  • Gnome 3.38 with Xwayland has screen corruption for X11 apps.

  • st/va fails to build on old libva in mesa git

  • sp_state_shader.c:146: undefined reference to `nir_to_tgsi’

  • anv: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.3d* failures

  • RADV: Death Stranding glitchy sky rendering

  • Crash in glDrawArrays on Intel iris

  • GLX_OML_swap_method not fully supported

  • deinterlace_vaapi=rate=field does not double output’s actual frame rate on AMD

  • Steam game Haydee leans on implementation-dependent behavior

  • ANV: Support 1 million update-after-bind descriptors

  • zink: crash in Blender on start-up

  • vc4 in 20.2-rc has regression causing app to crash

  • [RADV] broken stencil behaviour when using extended dynamic stencil state

  • [RADV/ACO] Star Citizen Lighting/Shadow Issue

  • [RADV] Some bindings seem broken with VK_DYNAMIC_STATE_VERTEX_INPUT_BINDING_STRIDE_EXT

  • [RADV/ACO] ACO build error about SMEM operands

  • Graphics corruption in Super Mega Baseball 2 with RADV on Navi

  • RADV ACO - ground line corruption in Path of Exile with Vulkan renderer

  • omx/tizonia build broken with latest mesa git

  • Request: VK_EXT_transform_feedback on Intel Gen 7

  • iris: Regression in deqp const_write tests

  • [hsw][bisected][regression] gpu hangs on dEQP-VK.subgroups.(shuffle|quad) tests

  • [RADV/LLVM/ACO] Serious Sam 4 crashes after first cutscene with ACO backend + flickering black spots sprout up everywhere

  • TGL B0 Stepping gpu hangs on many dEQP-VK.subgroups.quad nonconst tests

  • [machines without AVX2/F16C][bisected] X server crash, wflinfo crash in mesa CI

  • nir: Mesa regression on Compute shader

  • radv, aco: dEQP-VK.glsl.atomic_operations.*_fragment_reference regressed

  • Commit c6c1fa9a263880 causes corruption in Steam UI

  • [spirv-fuzz] Shader generates a wrong image

  • Running Amber test leads to VK_DEVICE_LOST

  • [Regression][Bisected][20.2][radeonsi] American Truck Simulator continually allocates memory until OOM

  • [radeonsi] bottom mips of height=1 2D texture is uninitialised after upload

  • Missing terrain in Total War: Warhammer

  • anv: dEQP-VK.robustness.robustness2.* failures on gen12

  • AMD VAAPI encoding - applying filters introduces garbled line at the bottom

  • AMD VAAPI HEVC encoding not working correctly on Polaris

  • [RADV] Problems reading primitive ID in fragment shader after tessellation

  • Massive memory leak (at least AMD, others unknown)

  • Substance Painter 6.1.3 black glitches on Radeon RX570

  • [ivb,hsw,byt,bsw][i965][bisected] anv_reloc_list_add: Assertion failure

  • vkCmdCopyImage broadcasts subsample 0 of MSAA src into all subsamples of dst on RADV

  • assert(left <= -1 && top <= -1 && right >= 1 && bottom >= 1) fails in si_emit_guardband

  • Crash in ruvd_end_frame when calling vaBeginPicture/vaEndPicture without rendering anything

  • Release signing key is not readily available

  • [iris][bisected] piglit.spec.nv_copy_depth_to_color.nv_copy_depth_to_color failures

  • VAAPI vaDeriveImage returns VA_STATUS_ERROR_OPERATION_FAILED

  • X-Plane 11 Installer crashes on startup since `glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins`

  • piglit spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test fails on Iris

  • Amber test opt_peel_loop_initial_if: Assertion failed

  • builder_misc.cpp:137:55: error: ‘get’ is not a member of ‘llvm::ElementCount’

  • AVX instructions leak outside of CPU feature check and cause SIGILL

  • Dirt Rally: Flickering glitches on certain foliage since Mesa 20.1.0 caused by MSAA

  • Horizon Zero Dawn graphics corruption with radv

  • Crusader Kings 3 Crashes at start since commit with !6472

  • pan_resource.c:733:38: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]

  • [BRW] WRC 5 asserts with gallium nine and iris.

  • ci/bare-metal: POWER_GOOD detection broken with reboot rework

  • radv: Corruption in “The Surge 2”

  • [RADV] Detroit: Become Human Demo game lock-ups with RADV

  • Road Redemption certain graphic effects rendered white color

  • gen_state_llvm.h:54:99: error: invalid conversion from ‘int’ to ‘const llvm::VectorType*’ [-fpermissive]

  • Using a shared dEQP build script

  • vulkan/wsi/x11: deadlock with Xwayland when compositor holds multiple buffers

  • [RADV/ACO] Death Stranding causes a GPU hang (*ERROR* Waiting for fences timed out!)

  • lp_bld_init.c:172:7: error: implicit declaration of function ‘LLVMAddConstantPropagationPass’; did you mean ‘LLVMAddCorrelatedValuePropagationPass’? [-Werror=implicit-function-declaration]

  • ci: Use lld or gold instead of ld.bfd

  • Intel Vulkan driver crash with alpha-to-coverage

  • radv: blitting 3D images with linear filter

  • [ACO] Compiling pipelines from RPCS3’s shader interpreter spins forever in ACO code

  • [regression][bisected] nir: nir_intrinsic_io_semantics assert failures in piglit

  • error: ‘static_assert’ was not declared in this scope

  • Intel Vulkan driver assertion with small xfb buffer

  • <<MESA crashed>> Array Index Out of Range with Graphicsfuzz application

  • EGL_KHR_swap_buffers_with_damage support on X11

  • [spirv-fuzz] SPIR-V parsing failed “src->type->type == dest->type->type”

  • radeonsi: radeonsi crashes in Chrome on chromeos

  • [RADV] commit d19bc94e4eb94 broke gamescope with Navi

  • 4e3a7dcf6ee4946c46ae8b35e7883a49859ef6fb breaks Gamescope showing windows properly.

  • anv: crashes in CTS test dEQP-VK.subgroups.*.framebuffer.*_tess_eval

  • Intel Vulkan (anv) crash in copy_non_dynamic_state() when using validation layer

  • [tgl][bisected][regression] GPU hang in The Witcher 3

  • Mafia 3: Trees get rendered incorrectly

  • radv: dEQP-VK.synchronization.op.multi_queue.timeline_semaphore.write_clear_attachments_*_concurrent fail when forcing DCC.

  • Crash on GTA 5 through proton 5.0.9 and GE versions

  • Flickering textures in “Divinity Original Sin Enhanced Edition”

  • Mesa 20.2.0-rc1 fails to build for AMD

  • Assertion failure compiling shader from Ziggurat

Changes

Aaron Watry (1):

  • clover: Fix incorrect error check in clGetSupportedImageFormats

Adam Jackson (22):

  • drisw: Port the MIT-SHM check to XCB

  • vulkan: Don’t pointlessly depend on libxcb-dri2

  • docs: Stop claiming to implement OpenVG

  • mesa: Fix GL_CLAMP handling in glSamplerParameter

  • mesa: Generate more errors from GetSamplerParameter

  • wsi/x11: Hook up VK_PRESENT_MODE_FIFO_RELAXED_KHR

  • glx: Use GLX_FUNCTION2 only for actually aliased function names

  • glx: Collect all the non-applegl extensions in the GetProcAddress table

  • glx: Reject glXSwapIntervalMESA greater than INT_MAX

  • glx: Implement GLX_EXT_swap_control for DRI2 and DRI3

  • glx/dri3: Implement GLX_EXT_swap_control_tear

  • glx: Allow depth-30 pbuffers to work without a depth-30 pixmap format

  • wsi/x11: Create a present queue for VK_PRESENT_MODE_FIFO_RELAXED_KHR

  • glx: move __glXGetUST into the DRI1 code

  • glx: Delegate the core of glXGetScreenDriver to the GLX screen vtable

  • glx: Move glXGet{ScreenDriver,DriverConfig} to common code

  • docs/features: Update extensions for softpipe

  • docs/features: Update extensions for swr

  • loader: Print dlerror() output in the failure message

  • mesa: Enable GL_MESA_pack_invert unconditionally

  • mesa: Implement GL_ANGLE_pack_reverse_row_order

  • docs: Add MESA_pack_invert and ANGLE_pack_reverse_row_order

Alejandro Piñeiro (147):

  • v3d/compiler: add v3dv_prog_data_size helper

  • v3d/packet: fix typo on Set InstanceID/PrimitiveID packet

  • v3d: set instance id to 0 at start of tile

  • broadcom/qpu_instr: wait is not a read or write vpm instruction

  • nir/lower_io: don’t reduce range if parent length is zero

  • broadcom/simulator: update to a newer simulator

  • broadcom/common: increase V3D_MAX_TEXTURE_SAMPLERS, add specific OpenGL limit

  • broadcom/compiler: add V3D_DEBUG_RA option

  • v3dv: add v3d vulkan driver skeleton

  • gitlab-ci: add broadcom vulkan driver

  • v3dv: add support for VK_EXT_debug_report

  • v3dv: memory management stubs

  • v3dv: add support to use v3d simulator

  • v3dv/debug: plug v3d_debug

  • v3dv/debug: add v3dv_debug

  • v3dv: stubs for graphics pipeline methods

  • v3dv: Create/DestroyShaderModule implementation

  • v3d/compiler: num_tex_used on v3d_key

  • v3dv/format: add v3dv_get_format_swizzle

  • v3dv: initial CreateGraphicsPipeline/DestroyPipeline implementation

  • v3dv: initial stub for CmdBindPipeline

  • v3dv: CmdSetViewport and CmdSetScissor implementation

  • v3dv/pipeline: start to track dynamic state

  • v3dv/cmd_buffer: init command buffer dynamic state during pipeline bind

  • v3dv/cmd_buffer: emit Scissor packets

  • v3dv/cmd_buffer: emit Viewport packets

  • v3dv/cmd_buffer: emit shader_state packets

  • v3dv/cmd_buffer: start to emit draw packets

  • v3dv/cmd_buffer: add shader source bos to cmd_buffer

  • v3dv: clif format dumping support

  • v3dv/cmd_buffer: cache viewport translate/scale

  • v3dv: add v3dv_write_uniforms

  • v3dv/cmd_buffer: start jobs with CmdBeginRenderPass

  • v3d/compiler: update uses_vid/uses_iid check

  • v3dv/cmd_buffer: emit CFG_BITS

  • v3dv: partial prepack of the gl_shader_state_record

  • v3dv: prepack VCM_CACHE_SIZE

  • v3dv/pipeline: lower fs/vs inputs/outputs

  • v3dv: vertex input support

  • v3dv: provide default values for input attributes

  • v3dv/format: add R32G32B32A32_SFLOAT format

  • v3dv: stubs for Create/DestroyPipelineCache

  • v3d/cmd_buffer: emit flat_shade/noperspective/centroid flags

  • v3dv/pipeline: adding some nir-based linking

  • v3dv/bo: add a bo name

  • v3dv: debug nir shader also after spirv_to_nir

  • v3dv: initial descriptor set support

  • v3dv/descriptor_set: support for array of ubo/ssbo

  • v3dv/pipeline: null check for pCreateInfo->pDepthStencilState

  • v3dv: no need to manually add assembly bo to the job

  • v3d/compiler: handle GL/Vulkan differences in uniform handling

  • v3dv/cmd_buffer: support for push constants

  • v3dv/descriptor: support for dynamic ubo/ssbo

  • v3dv/pipeline: revamp nir lowering/optimizations passes

  • v3dv/pipeline: clean up io lowering

  • v3dv/descriptor: take into account pPushConstantRanges

  • v3dv/device: tweak ssbo/ubo device limits

  • v3dv/cmd_buffer: rename and split emit_graphics_pipeline

  • v3dv/cmd_buffer: push constants not using descriptor anymore

  • v3dv/uniforms: cleaning up, moving update ubo/ssbo uniforms to a function

  • v3dv/pipeline: unify local allocator name

  • v3dv/pipeline: sampler lowering

  • v3dv/descriptor_set: added support for samplers

  • v3dv/uniforms: filling up QUNIFORM_TMU_CONFIG_P0/P1

  • v3dv/pipeline: add support for shader variants

  • v3dv/cmd_buffer: update shader variants at CmdBindDescriptorSets/CmdBindPipeline

  • v3dv/cmd_buffer: allow return in the middle of variant update if needed

  • v3dv/pipeline: fix adding texture/samplers array elements to texture/sampler map

  • v3dv/descriptor_set: support for immutable samplers

  • v3dv/descriptor: move descriptor_map_get_sampler, add and use get_image_view

  • v3dv/descriptor_set: combine texture and sampler indices

  • v3dv/descriptor: handle not having a sampler when combining texture and sampler id

  • v3dv/uniforms: fill up texture size-related uniforms

  • v3dv/format: expose correctly if a texture format is filterable

  • v3dv: handle texture/sampler shader state bo failure with OOM error

  • v3dv: properly return OOM error during pipeline creation

  • v3dv/meta-copy: ensure valid height/width with compressed formats

  • v3dv/cmd_buffer: move variant checking to CmdDraw

  • v3dv/pipeline: support for specialization constants

  • v3dv/descriptor: add general bo on descriptor pool

  • v3dv/descriptor: use descriptor pool bo for image/samplers

  • v3dv/meta-copy: add uintptr_t casting to avoid warning

  • v3dv/bo: adding a BO cache

  • v3dv/bo: add a maximum size for the bo_cache and a envvar to configure it

  • v3dv/bo: add dump stats info

  • v3d/tex: avoid to ask back for a sampler state if not needed

  • v3dv/pipeline: iterate used textures using the combined index map

  • v3dv/pipeline: set load_layer_id to zero

  • v3dv: initial support for input attachments

  • v3dv/descriptors: support for DESCRIPTOR_TYPE_STORAGE_IMAGE

  • v3dv/pipeline: lower_image_deref

  • v3dv/uniforms: support for some QUNIFORM_IMAGE_XXX

  • nir: include texture query lod as one of the ops that requires a sampler

  • v3dv/device: expose support for image cube array

  • v3dv/image: fix TEXTURE_SHADER_STATE depth for cube arrays

  • v3dv/device: add vendorID/deviceID get helpers

  • v3dv/device: get proper device ID under simulator

  • v3dv/device: proper pipeline cache uuid

  • v3dv/pipeline_cache: bare basic support for pipeline cache

  • v3dv/pipeline_cache: cache nir shaders

  • v3dv/pipeline: add basic ref counting support for variants

  • v3dv/pipeline_cache: cache v3dv_shader_variants

  • v3dv/pipeline_cache: support to serialize/deserialize cached NIRs

  • v3dv/pipeline_cache: MergePipelineCaches implementation

  • v3dv/pipeline: provide a shader_sha1 to private ShaderModules

  • v3dv/pipeline_cache: add default pipeline cache

  • v3dv/pipeline: remove custom variant cache

  • v3dv/pipeline: when looking for a variant, check first current variant

  • v3dv/pipeline: pre-generate more than one shader variant

  • v3dv/pipeline: handle properly OUT_OF_HOST_MEMORY error when allocating p_stage

  • v3dv/descriptor: support for UNIFORM/STORAGE_TEXEL_BUFFER

  • v3dv: add v3dv_limits file

  • v3dv/device: fix minTexelBufferOffsetAlignment

  • v3dv/formats: fix exposing FEATURE_UNIFORM/STORAGE_TEXEL_BUFFER_BIT

  • v3dv/uniforms: handle texture size for texel buffers

  • v3dv/descriptor: remove v3dv_descriptor_map_get_image_view

  • v3dv/device: add assert for texture-related limits

  • v3dv/device: warn when the pipeline cache is disabled

  • v3dv/debug: add v3dv_print_v3d_key

  • v3dv/pipeline: fix combined_index_map insertions

  • v3dv/meta: fix hash table insertion

  • broadcom/compiler: allow GLSL_SAMPLER_DIM_BUF on txs emission

  • v3d/simulator: add v3d_simulator_get_mem_size

  • v3dv/device: fix compute_heap_size for the simulator

  • v3dv/pipeline: use derefs for ubo/ssbo

  • v3dv: Call nir_lower_io for push constants

  • v3dv/pipeline: track if texture is shadow

  • v3dv/pipeline: set 16bit return_size for shadows always

  • v3dv/cmd_buffer: set instance id to 0 at start of tile

  • v3d/limits: add line width and point size limits

  • v3dv/device: fix point-related VkPhysicalDeviceLimits

  • v3dv/device: enable largePoints

  • v3dv/meta_copy: handle mirroring z component blitting 3D images

  • v3dv/formats: properly return unsupported for 1D compressed textures

  • v3dv/meta_copy: fix TFU blitting when using 3D images

  • v3dv/pipeline_cache: set a max size for the pipeline cache

  • v3dv/pipeline_cache: extend pipeline cache envvar

  • v3dv/device: Support loader interface version 3.

  • nir/lower_io_to_scalar: update io semantics on per-component inst

  • docs/features: add v3dv driver

  • v3dv/format: use XYZ1 swizzle for three-component formats

  • v3d/format: use XYZ1 swizzle for three-component formats

  • broadcom/compiler: remove v3d_fs_key depth_enabled field.

  • v3dv/util: remove several logging functions

  • v3dv/util: log debug ignored stype only on debug builds

  • v3dv/device: do nothing when asked physical device pci bus properties

  • v3dv/cmd_buffer: missing (uint8_t *) casting when calling memcmp

Alexandros Frantzis (5):

  • tracie: Make tests independent of environment

  • tracie: Produce JUnit XML results

  • gitlab-ci: Enable unit test reports for normal runner traces jobs

  • gitlab-ci: Enable unit test reports for lava traces jobs

  • gitlab-ci: Enable unit test report for arm64_a630_traces

Alyssa Rosenzweig (388):

  • panfrost: Remove blend prettyprinters

  • panfrost: Move format stringify to decode.c

  • pan/decode: Remove shader replacement artefact

  • panfrost: Inline panfrost-misc.h into panfrost-job.h

  • panfrost: Remove panfrost-misc.h

  • panfrost: Don’t export exception_status

  • panfrost: Rename encoder/ to lib/

  • panfrost: Move pandecode into lib/

  • pan/mdg: Separate disassembler and compiler targets

  • pan/bi: Separate disasm/compiler targets

  • panfrost: Reduce bit dependency to disassembly only

  • panfrost: Add panloader/ to .gitignore

  • pan/bi: Drop use of MALI_POSITIVE

  • panfrost: Inline max rt into compilers

  • panfrost: Treat texture dimension as first-class

  • panfrost: Drop compiler cmdstream deps

  • nir/lower_ssbo: Don’t set align_* for atomics

  • gallium/dri2: Support Arm modifiers

  • panfrost: Set `initialized` more conservatively

  • panfrost: Remove hint-based AFBC heuristic

  • panfrost: Introduce create_with_modifier helper

  • panfrost: Use modifier instead of layout throughout

  • panfrost: Account for modifiers when creating BO

  • panfrost: Respect modifiers in resource management

  • panfrost: Import staging routines from freedreno

  • panfrost: Choose AFBC when available

  • panfrost: Implement YTR availability check

  • panfrost: Enable YTR where allowed

  • panfrost: Allocate enough space for tiled formats

  • panfrost: Ensure AFBC slices are aligned

  • panfrost: Implement panfrost_query_dmabuf_modifiers

  • panfrost: Add stub midgard.xml

  • panfrost: Adopt gen_pack_header.py via v3d

  • panfrost: Build midgard_pack.h via meson

  • panfrost: Redirect cmdstream includes through GenXML

  • pan/decode: Add helper to dump GPU structures

  • panfrost: XMLify job_type

  • panfrost: XMLify draw_mode

  • panfrost: XMLify mali_func

  • panfrost: XMLify stencil op

  • panfrost: XMLify wrap modes

  • panfrost: XMLify viewport

  • panfrost: XMLify UBOs

  • panfrost: XMLify stencil test

  • panfrost: Simplify zsa == NULL case

  • panfrost: Simplify depth/stencil/alpha

  • panfrost: Don’t mask coverage mask to 4-bits

  • panfrost: XMLify Midgard samplers

  • panfrost: XMLify Bifrost samplers

  • panfrost: XMLify Midgard textures

  • panfrost: XMLify Bifrost textures

  • panfrost: Drop unused mali_channel_swizzle

  • panfrost: XMLify Block Format

  • panfrost: XMLify MSAA writeout mode

  • panfrost: XMLify exception access

  • panfrost: XMLify enum mali_format

  • panfrost: Set STRIDE_4BYTE_ALIGNED_ONLY

  • panfrost: Drop NXR format

  • panfrost: Squash 22-bit format field in attr_meta

  • panfrost: XMLify mali_channel

  • panfrost: XMLify attributes

  • panfrost: Merge attribute packing routines

  • panfrost: Add XML for attribute buffers

  • panfrost: Use better packs for blits

  • panfrost: Simplify offset fixup proof

  • panfrost: Make attribute-buffer map explicit

  • panfrost: Move attr_meta emission to the draw routine

  • panfrost: Use packs for attributes

  • panfrost: Hoist instance_shift/instance_odd fetch

  • panfrost: Inline panfrost_vertex_instanced

  • panfrost: Use packs for vertex attribute buffers

  • panfrost: Use packs for vertex built-ins

  • panfrost: Reword comment

  • panfrost: Pass varying descriptors by reference

  • panfrost: Factor out general varying case

  • panfrost: Use pack for XFB varying

  • panfrost: Use pack for general varying

  • panfrost: Use MALI_ATTRIBUTE_LENGTH

  • pan/bit: Use packs for Bifrost unit tests

  • panfrost: Remove mali_attr_meta

  • panfrost: Use packs for varying buffers

  • panfrost: Drop hand-rolled pandecode for attribute buffers

  • panfrost: Drop union mali_attr

  • panfrost: Update CI expectations

  • panfrost: Decontextualize rasterizer

  • panfrost: Drop rasterizer null checks in draw calls

  • panfrost: Drop ZSA null checks in draws

  • panfrost: Drop panfrost_invalidate_frame

  • panfrost: Drop QUADS primitive convert

  • panfrost: Hoist add_fbo_bo call

  • panfrost: Remove useless comment

  • panfrost: Hoist assert from bind to create

  • panfrost: Fix WRITES_GLOBAL bit

  • panfrost: Fix shared memory size computation

  • pan/mdg: Ensure barrier op is set on texture

  • pan/mdg: Handle 32-bit offsets from store_shared

  • pan/mdg: Identify barrier out-of-order field

  • pan/mdg: Fix printing of r26 ld/st sources post-RA

  • pan/mdg: Fix auxiliary load/store swizzle packing

  • panfrost: Pre-allocate memory for pool

  • panfrost: Introduce invisible pool

  • panfrost: Avoid minimum stack allocations

  • pan/decode: Don’t try to dereference heap mapping

  • panfrost: Share tiler_heap across batches/contexts

  • panfrost: Drop implicit blend pooling

  • panfrost: Explicitly handle nr_cbufs=0 case

  • panfrost: Drop depth-only case in blend finalize

  • panfrost: Keep finalized blend state constant

  • panfrost: Fix blend leak for render targets 5-8

  • panfrost: Free cloned NIR shader

  • panfrost: Free NIR of blit shaders

  • panfrost: Free hash_to_temp map

  • pan/mdg: Free previous liveness

  • panfrost: Use memctx for sysvals

  • panfrost: Free batch->dependencies

  • panfrost: Pass alignments explicitly

  • panfrost: Fix attribute buffer underallocation

  • panfrost: Don’t overallocate attributes

  • panfrost: Don’t reserve for NPOT w/o instancing

  • panfrost: Reduce attribute buffer allocations

  • panfrost: Fix alignment on Bifrost

  • gallium: Add util_blend_factor_uses_dest helper

  • gallium: Add util_blend_uses_dest helper

  • si: Use util_blend_factor_uses_dest

  • r300: Use util_blend_factor_uses_dest

  • pan/decode: Drop legacy 32-bit job support

  • panfrost: Decode nested structs correctly

  • panfrost: Hoist blend finalize calls

  • panfrost: Separate shader/blend descriptor emits

  • panfrost: XMLify blend flags

  • panfrost: Simplify make_fixed_blend_mode prototype

  • panfrost: Honour load_dest/opaque flags

  • panfrost: XMLify blend equation

  • panfrost: Combine frag_shader_meta_init functions

  • panfrost: Size UBO#0 accurately

  • panfrost: Clamp shader->uniform_count

  • panfrost: Bake the initial tag into the shader pointer

  • panfrost: Specialize compute vs frag shader init

  • panfrost: Rename shader emit functions

  • panfrost: Clean up blend shader errata handling

  • panfrost: Group SFBD state together

  • panfrost: XMLify Midgard properties

  • panfrost: Pack compute Midgard properties

  • panfrost: Use packs for fragment properties

  • panfrost: Use pack for shaderless

  • panfrost: Fold work_count packing for blend shaders

  • panfrost: Simplify bind_blend_state

  • panfrost: Remove midgard1 bitfield

  • panfrost: XMLify bifrost1

  • panfrost: Drop redundant NULL check

  • panfrost: Group SFBD code tighter

  • panfrost: XMLify Bifrost preload

  • panfrost: Identify additional SFBD flags

  • panfrost: Support SHADERLESS mode everywhere

  • panfrost: Quiet pandecode error

  • panfrost: Derive texture/sampler_count from shader

  • panfrost: XMLify beginning of shader descriptor

  • panfrost: Derive UBO count from shader_info

  • panfrost: Pack vertex properties when compiling

  • panfrost: Prepack fragment properties/preload

  • panfrost: Simplify shaderless packing

  • panfrost: Ensure shader-db state is zero-initialized

  • panfrost: Allocate a state uploader

  • panfrost: Upload shader descriptors at CSO create

  • panfrost: Use preuploaded shader descriptors

  • panfrost: XMLify the rest of shader_meta

  • panfrost: Inherit default values from structs

  • panfrost: Use pack for blit shaders

  • panfrost: Use pack for Bifrost test state

  • panfrost: Add optional opaque packs to GenXML

  • panfrost: Use opaque pack for vertex shaders

  • panfrost: Use pack for fragment shaders

  • pan/decode: Use unpacks for state descriptor

  • panfrost: Drop mali_shader_meta

  • panfrost: Add opaque midgard_blend XML

  • panfrost: Emit explicit REPLACE for disabled colour writeout

  • panfrost: Drop blend indirection

  • panfrost: Add padded type for instance fields

  • panfrost: Add XML for mali_vertex_tiler_postfix

  • panfrost: Use draw pack for blit

  • panfrost: Separate postfix from emits

  • panfrost: Inline vt_update_{rasterizer, occlusion}

  • panfrost: Remove postfix parameter from UBO upload

  • panfrost: Avoid postfix dep for vertex_data

  • panfrost: Don’t call panfrost_vt_init for compute

  • panfrost: Inline panfrost_vt_init

  • panfrost: Inline panfrost_vt_set_draw_info

  • panfrost: Detangle postfix from varying emits

  • panfrost: Use draw pack for compute jobs

  • panfrost: Use pack for draw descriptor

  • panfrost: Simplify ZSA bind

  • panfrost: Cleanup point sprite linking

  • panfrost: Drop point sprite from shader key

  • panfrost: XMLify primitive information

  • panfrost: Add invocation XML

  • panfrost: XMLify invocations

  • panfrost: Drop bifrost_payload_fused

  • panfrost: Inline bifrost_tiler_only

  • panfrost: Use nir_builder_init_simple_shader for blits

  • pan/decode: Drop scratchpad size dump

  • pan/decode: Drop mali_vertex_tiler_postfix arg

  • pan/decode: Print shader-db even for compute

  • pan/decode: Fix awkward syntax

  • pan/decode: Use generation for vertex_tiler_postfix

  • pan/decode: Use unpack for vertex_tiler_postfix_pre

  • panfrost: Remove mali_vertex_tiler_postfix

  • pan/decode: Drop prefix braces

  • panfrost: Emit texture/sampler points for compute

  • pan/mdg: Implement i/umul_high

  • pan/mdg: Scalarize 64-bit

  • pan/mdg: Bounds check swizzle writing globals

  • pan/mdg: Implement nir_intrinsic_load_sample_mask_in

  • pan/mdg: Refactor texture op/mode handling

  • pan/mdg: Add disassembly for shadow gathers

  • pan/mdg: Implement texture gathers

  • panfrost: Set PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS

  • docs/features: Add missing Panfrost extensions

  • pan/mdg: Fix discard encoding

  • pan/mdg: Fix perspective combination

  • panfrost: Drop PIPE_CAP_MAX_COMBINED_HW_ATOMIC_COUNTER/BUFFERS

  • mesa/st: Don’t set alpha if ALPHA_TEST is lowered

  • pan/mdg: Obey f2fmp size restriction in fuse_io_16

  • panfrost: Fix nonzero stencil mask on vertex/compute

  • pan/bit: Set d3d=true for CMP tests

  • pan/bit: Fix unit tests

  • pan/bi: Lower flrp16

  • pan/bi: Add XML describing the instruction set

  • pan/bi: Add ISA parser

  • pan/bi: Add packing generator

  • pan/bi: Add disassembler generator

  • pan/bi: Add disassembly prototypes

  • pan/bi: Add bi_disasm_dest_* helpers

  • pan/bi: Export dump_src

  • pan/bi: Use new disassembler

  • pan/bi: Use canonical syntax for registers/uniforms/imms

  • pan/bi: Use canonical syntax for special constants

  • pan/bi: Add dummy carry/borrow argument for iadd/isub

  • pan/bi: Introduce segments into the IR

  • pan/bi: Add format field to IR

  • pan/bi: Track compute_lod in IR

  • pan/bi: Pass blend descriptor explicitly in IR

  • pan/bi: Use 8-bit shifts

  • pan/bi: Use src1/dest_invert instead of src_invert[]

  • pan/bi: Move packing helpers to dedicated file

  • pan/bi: Use new packing

  • pan/bi: Remove unused prints

  • pan/bi: Remove unused packing data structures

  • pan/bi: Drop *FMIN reference

  • pan/bi: Annotate stop bit (canonically “Z-bit”)

  • pan/bi: Annotate disassemble with format names

  • pan/bi: Inline dump_instr

  • pan/bi: Track M values of disassembled constants

  • pan/bi: Decode M values in disasm

  • pan/bi: Disassemble PC-relative addresses

  • pan/bi: Add bifrost_reg_mode enum

  • pan/bi: Pass ‘first’ through disassembler

  • pan/bi: Decode all 32-bit register modes

  • pan/bi: Rename port -> slot

  • pan/bi: Use canonical register packing

  • pan/bi: Remove old register mode definitions

  • pan/bi: Fix assert when writing vertex outputs

  • pan/bi: Add copy for register COMBINEs

  • pan/decode: Ensure mappings are zeroed

  • pan/bi: Fix memory corruption in scheduler

  • pan/bi: Drop if 0’d combine lowering

  • pan/bi: Cull unnecessary edges on the CF graph

  • pan/bi: Use canonical floating-point modes

  • pan/bi: Canonicalize terminate_discarded_threads

  • pan/bi: Use canonical next_clause_prefetch

  • pan/bi: Use canonical name for staging registers

  • pan/bi: Expand clause type to 5-bit

  • pan/bi: Add missing message types

  • pan/bi: Print message types as strings

  • pan/bi: Use canonical term “message type”

  • pan/bi: Use canonical term dependency

  • pan/bi: Use canonical flow control enum

  • pan/bi: Pass flow_control through directly

  • pan/bi: Handle vector moves

  • pan/bi: Expose GL 2.1 on Bifrost

  • pan/bi: Fix simple txl test

  • pan/bi: Use canonical texture op names in IR

  • pan/bi: Streamline TEXC/TEXS naming/selection

  • pan/bi: Encode skip bit into IR

  • pan/bi: Pack skip bit for texture operations

  • pan/bi: Add texture operator descriptor

  • pan/bi: Stub out TEXC handling

  • pan/bi: Add data register passing infrastructure

  • pan/bi: Handle nir_tex_src_lod

  • pan/bi: Pack TEXC

  • pan/bi: Rewrite to fit dest = src constraint

  • pan/bi: Prefer ‘texture_index’ to ‘image_index’

  • panfrost: Add missing XML for Bifrost samplers

  • panfrost: Fix Bifrost filter selection

  • panfrost: Fix Bifrost high LOD clamp

  • panfrost: Add some missing Bifrost texture XML

  • pan/bi: Implement txb

  • panfrost: Set helper_invocation_enable for Bifrost

  • pan/bi: Fix message type printing

  • pan/bi: Don’t terminate helper threads

  • panfrost: Add panfrost_block_dim helper

  • pan/bi: Use new block dimension helper

  • panfrost: Fix faults on block-based formats on Bifrost

  • pan/bi: Map NIR tex ops to Bifrost ops

  • pan/bi: Add bi_emit_lod_cube helper

  • pan/bi: Implement FETCH

  • panfrost: Update XML for Bifrost early-z/FPK

  • panfrost: Set “shader modifies coverage?” flag

  • panfrost: Temporarily disable FP16 on Bifrost

  • pan/bi: Disable mediump output lowering

  • pan/bi: Range check newc/oldc when rewriting

  • panfrost: Rename gtransfer to transfer

  • panfrost: Use canonical characterization of tls_size

  • panfrost: Drop panfrost_vt_emit_shared_memory

  • pan/mdg: Cleanup mir_rewrite_index_src_single

  • pan/bi: Drop 64-bit constant support

  • pan/bi: Fix handling of small constants in bi_lookup_constant

  • pan/bi: Stub spilling

  • pan/bi: Add no_spill flag to IR

  • pan/bi: Implement bi_choose_spill_node

  • pan/bi: Add spills/fills parameters

  • pan/bi: Add bi_spill helper

  • pan/bi: Add bi_fill

  • pan/bi: Add bi_rewrite_index_src_single helper

  • pan/bi: Add helpers for working with singletons

  • pan/bi: Implement bi_spill_register

  • pan/bi: Factor out singleton construction from scheduler

  • pan/bi: Add bi_foreach_clause_in_block_safe helper

  • pan/bi: Pack LOAD/STORE

  • pan/bi: Implement spilling

  • pan/bi: Pipe through tls_size

  • panfrost: Move nir_undef_to_zero to common util/

  • pan/bi: Use nir_undef_to_zero

  • panfrost: Record architecture major version

  • panfrost: Don’t export queries

  • panfrost: Calculate thread count on Bifrost

  • panfrost: Fix component order XML

  • panfrost: Implement BGRA textures

  • panfrost: Drop PIPE_CAP_GLSL_FEATURE_LEVEL for Bifrost

  • panfrost: Don’t advertise MSAA on Bifrost

  • pan/bi: Account for bool32 ld_ubo reads

  • panfrost: Don’t double-compose swizzles

  • panfrost: Add MALI_EXTRACT_INDEX helper

  • panfrost: Use consistent swizzle names in XML

  • panfrost: Add a blendable format table

  • panfrost: Use panfrost_blendable_formats for MFBD

  • panfrost: Use panfrost_blendable_formats for SFBD

  • panfrost: Use panfrost_blendable_formats for blending

  • panfrost: Complete format_to_bifrost_blend

  • panfrost: Remove duplicated format arg for ASTC

  • panfrost: Remove panfrost_is_z24s8_variant

  • panfrost: Add v7 special colour formats

  • panfrost: Add missing depth/stencil formats

  • panfrost: Add miscellaneous missing Midgard formats

  • panfrost: Add v7-specific depth formats

  • panfrost: Split out v6/v7 format tables

  • panfrost: Rename VARYING_DISCARD to CONSTANT

  • panfrost: Rename VARYING_POS to SNAP4

  • panfrost: Add missing 1/2/4/64-bit formats to XML

  • panfrost: Use macro for panfrost_get_default_swizzle

  • panfrost: Fix RGB5A1 formats

  • panfrost: Fix BGR233 component order

  • panfrost: Add missing alpha-first special formats

  • pan/bi: Suppress disassembly for internal shaders

  • pan/bi: Lower +CUBEFACE2

  • panfrost: Disable point sprites on Bifrost

  • panfrost: Advertise Bifrost support

  • panfrost: Drop unused swizzles

  • panfrost: Add bi_emit_array_index helper

  • pan/bi: Track tex data register swizzles

  • pan/bi: Handle 3D/array coordinates

  • pan/bi: Don’t emit TEXS for array textures

  • panfrost: Set .array_size on Bifrost

  • nir: Add SRC_TYPE to store_combined_output_pan

  • pan/mdg: Deduplicate nir_find_variable_with_driver_location

  • pan/mdg: Move writeout lowering to common panfrost

  • panfrost: Pass through src_type

  • panfrost: Deduplicate shader properties

  • pan/bi: Add +ZS_EMIT instruction to IR

  • pan/bi: Infer z/stencil flags from sources passed

  • pan/bi: Factor out bi_emit_atest

  • pan/bi: Factor out bi_emit_blend

  • pan/bi: Stub handling for nir_intrinsic_store_combined_output_pan

  • pan/bi: Emit +ZS_EMIT as needed

  • pan/bi: Lower depth/stencil stores

  • pan/bi: Correctly calculate render target index

  • pan/mdg: Add missing Collabora copyright notices

  • panfrost: Add missing Collabora copyright notices

  • pan/bi: Model writemasks correctly

Andreas Baierl (4):

  • lima/ppir: Skip instruction merge when having more than one successor

  • lima: fix glCopyTexSubImage2D

  • lima: set clear depth value to 0x00ffffff as default

  • lima/parser: Fix varyings decoding in RSW

Andres Gomez (3):

  • gitlab-ci: reuse container_post_build when building the test images

  • gitlab-ci: reorder container_post_build call for arm64_test image

  • Revert “gitlab-ci: reuse container_post_build when building the test images”

Andrew Randrianasulu (1):

  • st/va: fix build with old libva

Andrey Vostrikov (1):

  • egl/x11: Free memory allocated for reply structures on error

Andrii Simiklit (4):

  • util/xmlconfig: eliminate memory leak

  • nir: get rid of OOB dereferences in nir_lower_io_arrays_to_elements

  • glx: get rid of memory leak

  • glsl: avoid an out-of-bound access while setting up a location for variable

Anthoine Bourgeois (4):

  • docs/features: Minor update extensions support

  • docs/features: VK_KHR_mir_surface is disabled, remove it

  • docs/features: add some extensions we missed

  • docs/features.txt: VK_EXT_separate_stencil_usage not exposed on RADV

Antonio Caggiano (1):

  • zink: pre-hash gfx-pipeline-state

Anuj Phogat (2):

  • intel/gen9: Enable MSC RAW Hazard Avoidance

  • intel: Pointer to SCISSOR_RECT array should be 64B aligned

Aníbal Limón (1):

  • src/util/disk_cache_os.c: Add missing headers for open/fcntl

Arcady Goldmints-Orlov (7):

  • broadcom/compiler: support nir_intrinsic_load_sample_id

  • broadcom/compiler: Add a constant folding pass after nir_lower_io

  • broadcom/compiler: Enable PER_QUAD for UBO and SSBO loads.

  • broadcom/compiler: support varyings with struct types

  • broadcom/compiler: use nir io semantics

  • broadcom/compiler: Handle non-SSA destinations for tex instructions

  • broadcom/compiler: Allow spills of temporaries from TMU reads

Bas Nieuwenhuizen (58):

  • radv: Add ETC2 support on RAVEN2.

  • radv: Fix assert that is too strict.

  • radv: Add forcecompress debug flag.

  • radv: Do not consider layouts fast-clearable on compute queue.

  • radv: Update CI expectations for the recent descriptor indexing regressions.

  • radv: When importing an image, redo the layout based on the metadata.

  • radv: Clean up setting the surface flags.

  • radv: Use getter instead of setter to extract value.

  • driconf: Support selection by Vulkan applicationName.

  • radv: Override the uniform buffer offset alignment for World War Z.

  • radv: Fix handling of attribs 16-31.

  • radv: Remove conformance warnings with ACO.

  • radv: Update CTS version.

  • radv: Fix 3d blits.

  • radv: Centralize enabling thread trace.

  • radv: Allow triggering thread traces by file.

  • radv: Fix threading issue with submission refcounts.

  • radv: Avoid deadlock on bo_list.

  • spirv: Deal with glslang bug not setting the decoration for stores.

  • spirv: Deal with glslang not setting NonUniform on constructors.

  • radeonsi: Work around Wasteland 2 bug.

  • radv,gallium: Add driconf option to reduce advertised VRAM size.

  • amd/common: Store non-displayable DCC pitch.

  • radeonsi: Put retile map in separate buffers.

  • radeonsi: Move display dcc dirty tracking to framebuffer emission.

  • ac/surface: Fix depth import on GFX6-GFX8.

  • radv,radeonsi: Disable compression on interop depth images

  • Revert “radv: set BIG_PAGE to improve performance on GFX10.3”

  • Revert “radv: emit {CB,DB}_RMI_L2_CACHE_CONTROL at framebuffer time”

  • st/mesa: Deal with empty textures/buffers in semaphore wait/signal.

  • radv: Disable NGG on APUs.

  • radv: Simplify radv_is_hw_resolve_pipeline.

  • radv: Add VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 rendering support.

  • radv: Fix emitting SQTT userdata.

  • radv: Use correct alignment for SQTT buffer sizes.

  • radv: Fix RGP Asic CU info for GFX10+.

  • radv: Include flushes in the barrier.

  • radv: Record cache flushes for RGP.

  • radv: Write correct dispatch size for RGP.

  • radeonsi: Fix imports with displayable DCC.

  • radv: Use atomics to read query results.

  • radv: Set fce metadata correctly on DCC initialization.

  • radv: Fix event write cmdbuffer allocation when tracing.

  • radv/winsys: Expand scope of allbos lock.

  • radv: Fix mipmap extent adjustment on GFX9+.

  • aco: Add VK_KHR_shader_terminate_invocation support.

  • amd/llvm: Add VK_KHR_shader_terminate_invocation support.

  • radv: Advertise VK_KHR_shader_terminate_invocation.

  • frontends/va: Initialize drm modifier on import.

  • radv: Fix 1D compressed mipmaps on GFX9.

  • radv: Do not access set layout during vkCmdBindDescriptorSets.

  • radv: Fix variable name collision.

  • radv: Skip tiny non-visible VRAM heap.

  • radv: Fix budget calculations with large BAR.

  • radv: Fix exporting/importing multisample images.

  • radv: Fix RB+ blending for VK_FORMAT_E5B9G9R9_UFLOAT_PACK32.

  • radv: Fix a hang on CB change by adding flushes.

  • radv: Deal with unused attachments in mip flush

BillKristiansen (1):

  • compiler/glsl: Initialize local variable to zero to fix MSVC RTC error

Boris Brezillon (141):

  • spirv: Move the emit a ‘return value’ store logic into own function

  • compiler/nir: Add new flags to lower pack/unpack split instructions

  • nir: Fix i64tof32 lowering

  • spirv: Add support for the CL Round instruction

  • panfrost: Rename panfrost_create_pool() into panfrost_pool_init()

  • panfrost: Avoid accessing pan_pool fields directly

  • panfrost: Store transient BOs in a dynamic array

  • spirv: Add a vtn_get_mem_operands() helper

  • spirv: Don’t accept CPacked decoration on struct members

  • spirv: Propagate packed information to glsl_type

  • glsl: Propagate packed info in get_explicit_type_for_size_align()

  • nir/glsl: Consider block interfaces as structs when it comes to size/align calculation

  • nir: Expose the packed attribute attached to glsl_type objects

  • panfrost: gen_pack: Minor formatting improvement

  • panfrost: gen_pack: Fix __gen_unpack_uint()

  • panfrost: gen_pack: Add pan_{unpack,print}() helpers

  • panfrost: gen_pack: Move the group get_length() logic to its own method

  • panfrost: gen_pack: Add the aggregate concept

  • panfrost: gen_pack: Allow empty structs

  • panfrost: gen_pack: Add an align() modifier

  • panfrost: gen_pack: Add a log2 modifier

  • panfrost: gen_pack: Allow enum/define values expressed in hexadecimal

  • panfrost: decode: Make the indentation consistent with auto-generated print helpers

  • panfrost: decode: Rework the DUMP_{CL,ADDR}() macros

  • panfrost: decode: Add a macro to dump unpacked descriptors

  • panfrost: decode: Use pan_{unpack,print}() when applicable

  • panfrost: XML-ify the local storage descriptor

  • panfrost: Clarify what TILED mode is

  • panfrost: Add Tiled linear mode to the Block Format enum

  • panfrost: XML-ify the midgard tiler descriptor

  • panfrost: XML-ify the single target framebuffer descriptor

  • panfrost: XML-ify the bifrost tiler descriptors

  • panfrost: XML-ify the multi-target framebuffer descriptors

  • panfrost: XML-ify the job header descriptor

  • panfrost: XML-ify the write value job descriptor

  • panfrost: XML-ify the fragment job descriptor

  • panfrost: Rename the Blend dither disable flag

  • panfrost: XML-ify the compute job descriptor

  • panfrost: Avoid copying job descriptors around when we can

  • panfrost: decode: Misc formatting improvements

  • panfrost: gen_pack: Fix gnu-empty-initializer errors

  • ci: Extend meson-clang coverage by compiling all gallium drivers

  • panfrost: Fix bifrost tiler descriptor definition

  • panfrost: Fix bifrost tiler job emission

  • panfrost: Adjust quirks for bifrost v6

  • panfrost: Add preliminary support for Mali G72

  • kmsro: Add mediatek entry point

  • panfrost: Add support for rbg16 formats

  • panfrost: decode: Fix decode_bifrost_constant() prototype

  • panfrost: decode: Flag pandecode_log_typed() as PRINTFLIKE

  • panfrost: bifrost: disassemble: Fix decoding of next_regs

  • panfrost: Fix a warning

  • panfrost: Adjust the draw descriptor definition

  • panfrost: Adjust the primitive desc definition

  • panfrost: Adjust the renderer state definition

  • panfrost: Get rid of the with_opaque qualifier on the renderer state desc

  • panfrost: Drop the with_opaque specifier on midgard blend desc

  • panfrost: gen_pack: Drop support for opaque structs

  • panfrost: gen_pack: Support overlapping structs

  • panfrost: gen_pack: Add a no-direct-packing attribute

  • panfrost: Rework fixed-function blending

  • panfrost: Rework the render target layout to use overlapping structs

  • panfrost: XML-ify the blend descriptors

  • panfrost: Fix fixed-function blend on Mali v6

  • panfrost: Constify the rt_fmts arg passed to pan_lower_framebuffer()

  • panfrost: Move the blend constant mask extraction out of make_fixed_blend_mode()

  • panfrost: Pass compile arguments through a struct

  • panfrost: Allocate blit_blend with ralloc()

  • panfrost: Don’t leak NIR blend shaders

  • panfrost: Let compile_blend_shader() allocate the blend shader object

  • panfrost: Get rid of the constant patching done on blend shader binaries

  • panfrost: Move the blend shader cache at the context level

  • panfrost: Fix fixed-function blend on bifrost

  • panfrost: Extend compile_inputs to pass a blend descriptor

  • pan/bi: Copy blend shader info from compile_inputs

  • pan/bi: Use canonical name for FAU RAM sources

  • pan/bi: Get rid of the regs argument in bi_assign_fau_idx()

  • pan/bi: Rework blend descriptor access handling

  • pan/bi: Add support for load_blend_const_color_{r,g,b,a}_float

  • pan/bi: Support indirect jumps

  • panfrost: Add a “Bifrost Internal Blend” descriptor

  • panfrost: Scalarize nir_load_blend_const_color_rgba

  • panfrost: Flag blend shader function as an entry point

  • pan/bi: Add load_output support

  • pan/bi: Collect return addresses of blend calls

  • pan/bi: Special-case BLEND instruction emission for blend shaders

  • pan/bi: Reserve r0-r3 in blend shaders

  • pan/bi: Special-case load_input for blend shaders

  • panfrost: Add missing tile-buffer formats to the format enum

  • panfrost: Add blend shader support to bifrost

  • panfrost: Adjust the renderer state definition

  • panfrost: Fix tiler job injection

  • panfrost: Add the bifrost tiler internal state field

  • panfrost: Add specialized preload descriptors

  • panfrost: Replace unknown renderer state fields by their real names

  • pan/bi: Make sure we don’t print special index as a register

  • pan/bi: Print blend descriptor source properly

  • pan/bi: Add support for load_sample_id

  • pan/bi: Support the case where TEXC needs 0 or 1 staging reg

  • pan/bi: Add basic support for txf_ms

  • panfrost: Make {midgard,bifrost}_compile_shader_nir() return a program object

  • panfrost: Build blit shaders on Bifrost too

  • panfrost: Use real name for attribute’s unknown field

  • panfrost: Rename panfrost_transfer to panfrost_ptr

  • panfrost: Pass the texture payload through a panfrost_ptr

  • panfrost: Split panfrost_load_midg()

  • panfrost: Add support for native wallpapering on Bifrost

  • panfrost: Use native wallpapering on Bifrost

  • panfrost: Get rid of the non-native wallpapering bits

  • panfrost: Preload primitive flags when gl_FrontFacing is accessed

  • pan/bi: Add support for load_front_face

  • pan/bi: Add support for load_point_coord

  • pan/bi: Lower {i,u}{min,max} instructions

  • pan/bi: Add ult support

  • pan/bi: Fix ms_idx type to catch missing ms_index source

  • panfrost: Leave push_constants pointer to NULL if there’s no uniform

  • panfrost: Suppress Bifrost prefetching

  • panfrost: Add array size to XML

  • panfrost: Implement v7 texture payloads

  • pan/bi: s/t0/t1/ in bi_disasm_dest_add()

  • pan/bi: Move special instruction packing to a separate helper

  • pan/bi: Split special class in two

  • pan/bi: Hook up cube instructions packing

  • pan/bi: Lower cube map coordinates

  • panfrost: Force late pixel kill when depth/stencil is written from the FS

  • panfrost: Expose GLES3 features on Bifrost when PAN_MESA_DEBUG=deqp

  • pan/bi: Extract LD_VAR sample field from ins->load_vary.interp_mode

  • pan/bi: Support centroid and sample interpolations

  • pan/bi: Fix swizzle handling in bi_copy_src()

  • pan/bi: Add support for load_ubo

  • pan/bi: Lower uniforms to UBO

  • pan/bi: Get rid of bi_emit_ld_uniform()

  • pan/bi: Move bitwise op packing out of bi_pack_fma()

  • pan/bi: Fix ARSHIFT definitions

  • pan/bi: Add support for ishr

  • pan/bi: Add support for ushr

  • panfrost: Allow linear ZS resources on Bifrost

  • pan/bi: Add support for load_vertex_id

  • pan/bi: Add support for load_instance_id

  • panfrost: Fix Bifrost blend descriptor emission

  • panfrost: Fix ->reads_frag_coord assignment

Boyuan Zhang (5):

  • vl: add flag and definition for protected playback

  • frontends/va: handle protected slice data buffer

  • radeon: add decryption params definition header

  • radeon/vcn: add defines for drm message buffer

  • radeon/vcn: program drm message buffer

Brendan Dougherty (1):

  • mesa: Fix vertex_format_to_pipe_format index.

Caio Marcelo de Oliveira Filho (19):

  • intel/compiler: Use C99 array initializers for prog_data/key sizes

  • nir: Add nir_intrinsic_terminate and nir_intrinsic_terminate_if

  • spirv: Update headers and metadata from latest Khronos commit

  • spirv: Handle SpvOpTerminateInvocation

  • intel/fs: Handle nir_intrinsic_terminate

  • vulkan: Update XML and headers to 1.2.158

  • anv: Advertise VK_KHR_shader_terminate_invocation

  • nir: Use a switch in nir_lower_explicit_io_instr

  • intel/fs: Don’t emit_uniformize when getting a constant SSBO index

  • spirv: Implement SpvCapabilitySubgroupShuffleINTEL from SPV_INTEL_subgroups

  • nir: Add nir_intrinsic_{load,store}_deref_block_intel

  • spirv: Implement SpvCapabilitySubgroupBufferBlockIOINTEL

  • intel/fs: Add A64 OWORD BLOCK opcodes

  • intel/fs: Implement nir_intrinsic_{load,store}_global_block_intel

  • intel/fs: Add surface OWORD BLOCK opcodes

  • intel/fs: Implement nir_intrinsic_{load,store}_ssbo_block_intel

  • intel/fs: Implement nir_intrinsic_{load,store}_shared_block_intel

  • compiler: Add new Vulkan shader stages

  • spirv: Add Ray Tracing execution models

Caleb Callaway (1):

  • iris: Add missing newline to debug log message

Chad Versace (2):

  • anv/image: Check DISJOINT in vkGetPhysicalDeviceImageFormatProperties2 (v2)

  • anv/image: Fix isl_surf_usage_flags for stencil images

Charmaine Lee (3):

  • st/mesa: increase size of gl_register_file bitfields

  • winsys/svga: fix display corruption after surface_init

  • svga: fix draw elements with 8-bits indices

Chia-I Wu (2):

  • virgl: move protocol headers to a common place

  • virgl: update protocol headers

Christian Gmeiner (17):

  • etnaviv: call nir_lower_bool_to_bitsize

  • etnaviv: completely turn off MSAA

  • ci: do not build libdrm for vc4, freedreno and etnaviv

  • etnaviv: call nir_opt_shrink_vectors(..) in opt loop

  • etnaviv: shuffle some variant fields

  • etnaviv: add disk cache

  • etnaviv: simplify linear stride implementation

  • ci: piglit: conditionally build OpenCL tests

  • ci/bare-metal: suppress ‘No such file or directory’

  • etnaviv: drop etna_pipe_wait(..)

  • ci/x86: speed up piglit testing

  • nir: make tgsi_varying_semantic_to_slot(..) public

  • etnaviv: convert from tgsi semantic/index to varying-slot

  • etnaviv: move etna_dump_shader(..) to generic location

  • etnaviv: move etna_destroy_shader(..) to generic location

  • etnaviv: nir: do not run opt loop after nir_lower_bool_xxx(..)

  • etnaviv/drm: fix evil-twin etna_drm_table_lock

Connor Abbott (59):

  • freedreno/afuc: Fix printing preemptleave on a5xx

  • freedreno/afuc: Handle setsecure opcode

  • freedreno/afuc: Add iret

  • freedreno/afuc: Handle xmov modifiers

  • freedreno/afuc: Make 0 a valid number

  • freedreno/afuc: Install asm/disasm

  • freedreno: Add afuc regression test

  • nir/spirv: Add the option to keep ViewIndex as an input

  • nir/lower_input_attachments: Refactor to use an options struct

  • nir/lower_input_attachments: Support loading layer id as an input

  • radv: Use an input for the layer when lowering input attachments

  • tu: Use an input for the layer when lowering input attachments

  • nir/lower_input_attachments: Support loading layer id via gl_ViewIndex

  • freedreno/a6xx: Add multiview registers

  • ir3: Add support for gl_ViewIndex in VS & FS

  • tu: Translate VkRenderPassMultiviewCreateInfo to VkRenderPassCreateInfo2

  • tu: Parse multiview render pass info

  • tu: Implement multiview clear/resolve interactions

  • tu: Improve timestamp queries

  • tu: Implement multiview query interactions

  • tu: Add multiview lowering pass

  • tu: Implement multiview pipeline state

  • tu: Enable VK_KHR_multiview

  • freedreno/computerator: Use a render node

  • tu: Expose shaderStorageImageExtendedFormats

  • tu: Expose shaderImageGatherExtended

  • ir3: Don’t use the format to get the image type

  • tu: Expose shaderStorageImage*WithoutFormat

  • nir: Add nir_lower_multiview pass

  • anv: Use nir_lower_multiview pass

  • nir: Count i/o slots correctly for per-view variables

  • nir/lower_io_arrays: Fix xfb_offset bug

  • nir: Add per_view to IO semantics

  • nir: Handle per-view io in nir_io_add_const_offset_to_base()

  • tu: Write multiview control registers in binning pass

  • tu: Refactor shader compilation flow

  • ir3, tu: Run optimization loop twice

  • ir3, tu: Link per-view position correctly

  • tu: Enable multi-position output

  • intel/nir: Use nir control flow helpers

  • radv: Use nir control flow insertion helpers

  • ttn: Use nir control flow insertion helpers

  • nir/lower_returns: Use nir control flow insertion helpers

  • nir/opt_if: Remove open-coded nir_ssa_def_rewrite_uses()

  • nir/opt_if: Use early returns in opt_if_merge()

  • ttn: Fix number of components for IF/UIF

  • nir/lower_clip_cull: Store array size for FS inputs

  • ir3: Switch tess lowering to use location

  • ir3: Handle clip+cull distances

  • tu: Implement clip/cull distances

  • freedreno/a6xx: Implement user clip/cull distances

  • freedreno: Introduce common device info struct

  • tu: Use freedreno_dev_info

  • freedreno: Use freedreno_dev_info

  • freedreno/a6xx: Update SO registers for streams

  • ir3: Support geometry streams

  • util/bitset: Add a range iterator helper

  • tu: Support geometryStreams

  • tu: Support rasterizerDiscardEnable and RasterizationStreamSelect

Daniel Abrecht (1):

  • etnaviv: Make sure to track different pipe_screens for different DRM device descriptions

Daniel Schürmann (26):

  • aco: execute branch instructions in WQM if necessary

  • nir,amd: remove trinary_minmax opcodes

  • aco/isel: refactor code and remove unnecessary v_mov

  • aco/isel: refactor emit_vop3a_instruction() to handle 2 operand instructions

  • ac/nir: implement nir_op_[un]pack_[64/32]_*

  • aco: propagate SGPRs into VOP1 instructions early.

  • aco: expand create_vector more carefully w.r.t. subdword operands

  • aco: use p_create_vector for nir_op_pack_half_2x16

  • nir/opt_algebraic: optimize unpack_half_2x16_split_x(ushr, a, 16)

  • aco: use p_split_vector for nir_op_unpack_half_*

  • aco: add validation rules for p_split_vector

  • aco: use v_cvt_pkrtz_f16_f32 for pack_half_2x16

  • radv,aco: lower_pack_half_2x16

  • aco: use VOP2 version of v_cvt_pkrtz_f16_f32 on GFX_6_7_10

  • aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible

  • aco: refactor GFX6_7 subdword copy lowering

  • aco: improve code sequences for 16bit packing

  • aco: use do_pack() for self-intersecting operations.

  • aco: fix GFX8 16-bit packing

  • aco: implement nir_op_unpack_[64/32]_*

  • ac/nir: implement nir_op_[un]pack_64_4x16

  • nir: add options to lower nir_op_pack_[64/32]_* via nir_lower_alu_to_scalar()

  • radv: lower pack_[64/32]_* via nir_lower_alu_to_scalar()

  • radv: remove call to nir_lower_pack()

  • aco: refactor split_store_data() to always split into evenly sized elements

  • nir/lcssa: consider loops with no back-edge invariant

Daniel Stone (17):

  • glsl/test: Don’t run whitespace tests in parallel

  • CI: Disable Panfrost T860 and AMD Stoney tests

  • CI: Skip flaky CS test on VirGL

  • CI: Skip another flaky GS test on softpipe

  • CI: Disable Panfrost T720/T760 CI

  • meson: Add MSVC narrowing-int-to-char warnings

  • CI: Windows: Use separate config file for Docker

  • CI: Re-enable VS2019 build

  • CI: Disable Windows again

  • CI: Temporarily disable Panfrost T7xx

  • CI: Re-enable Panfrost T7xx

  • CI: Disable Panfrost T7xx CI

  • CI: Re-enable Panfrost T7xx CI

  • CI: Don’t run pixmark-piano twice on radeonsi

  • CI: Only run OpenCL tests when we need to

  • CI: Disable Panfrost T760

  • freedreno: Add missing dependency to build

Danylo Piliaiev (19):

  • st/mesa: Treat vertex outputs absent in outputMapping as zero in mesa_to_tgsi

  • anv/nir: Unify inputs_read/outputs_written between geometry stages

  • spirv: Only require bare types to match when copying variables

  • ir_constant: Return zero on out-of-bounds vector accesses

  • glsl: Eliminate assignments to out-of-bounds elements of vector

  • glsl: Eliminate out-of-bounds triop_vector_insert

  • intel/disasm: Change visibility of has_uip and has_jip

  • intel/disasm: brw_label and support functions

  • intel/disasm: Label support in shader disassembly for UIP/JIP

  • intel/assembler: Add labels support

  • intel/compiler: Fix pointer arithmetic when reading shader assembly

  • st/nir: Call st_glsl_to_nir_post_opts before interface unification

  • nir/lower_io: Eliminate oob writes and return zero for oob reads

  • nir/large_constants: Eliminate out-of-bounds writes to large constants

  • nir/lower_samplers: Clamp out-of-bounds access to array of samplers

  • intel/fs: Disable sample mask predication for scratch stores

  • docs: add INTEL_SHADER_ASM_READ_PATH description

  • nir/lower_returns: Append missing phis’ sources after “break” insertion

  • freedreno/a6xx: Fix typo in height alignment calculation in a6xx layout

Dave Airlie (115):

  • anv: add no reloc flags on empty and simple bo paths.

  • CI: temp disable t720/t760 jobs.

  • llvmpipe: only read 0 for channels being read

  • gallium: add an interface for memory allocations.

  • gallium: add a resource flag to say no over allocation.

  • llvmpipe: add support for memory allocation APIs

  • gallivm/nir: add load push constant support

  • util/format: add some ZS helpers for vallium

  • vulkan/wsi: add sw support. (v2)

  • vallium: initial import of the vulkan frontend

  • llvmpipe/blit: for 32-bit unorm depth blits just copy 32-bit

  • llvmpipe: enable GL 4.5

  • vallium: fix input attachment lowering variable shadowing

  • llvmpipe/cs: update compute counters not fragment shader.

  • gallium/nir/tgsi: reindent some code in the nir->tgsi info (v2)

  • gallivm/nir: add imod support

  • gallivm/sample: fix lod query on array textures.

  • llvmpipe: lower uadd_carry/usub_borrow.

  • gallium/nir/tgsi: add support for compact variables

  • gallivm/nir: fixup compact TCS variable storage.

  • gallivm/nir: split tex/sampler indirect offsets

  • llvmpipe: lower cs local index from id

  • llvmpipe: lower mul 2x32_64

  • llvmpipe/nir: lower some more packing at nir level.

  • llvmpipe: add reference counting to fragment shaders.

  • vallium: handle 3D image views properly.

  • vallium: limit buffer allocations to gallium max.

  • gallium/nir/tgsi: fix nir->tgsi info conversion for samplers/image

  • gallivm/nir: lower tg4 offsets.

  • gallivm/nir: add indirect swizzle output loading support

  • gallivm/nir: add quantize to f16 support

  • gallivm/nir: fix const compact

  • gallivm/nir: lower frexp/ldexp

  • gallivm/nir: add subpass sampler type support

  • gallivm: use common code to select texel type

  • llvmpipe: blend has effects even if no colorbuffers.

  • llvmpipe: add array/3d clearing support

  • llvmpipe/fs: multisample depth/stencil bad ir generated

  • gallivm/nir: allow 64-bit arit ops

  • gallivm/nir: add some f16 support

  • vallium: disable VK_KHR_shader_float16_int8.

  • vulkan/device_select: don’t pick a cpu driver as the default

  • llvmpipe: include gallivm perf flags in shader cache.

  • gallivm: disable brilinear for lod bias and explicit lod.

  • vtn: add an option to create a nir library from spirv

  • clover/nir: add libclc lowering pass

  • util: add missing extern C

  • clover: handle libclc shader (v3)

  • gallivm: fix pow(0, y) to be 0

  • gallivm: fix 64-bit CL intrinsics.

  • gallivm/nir: fix up non 32-bit load stores

  • gallivm/nir: handle non-32-bit mul high

  • llvmpipe: use an alternate env var to enable clover.

  • lavapipe: rename vallium to lavapipe

  • gallivm/nir: make sure to mask global reads.

  • llvmpipe/cs: add in shader shared size.

  • gallivm/nir: fix non-32 bit find lsb/msb

  • lavapipe: drop dri,dricommon deps.

  • ci: move to using clang 10 for meson + clover

  • clover: Use core libclc loader

  • ci: enable piglit testing of clover/llvmpipe.

  • clover: don’t call validate spirv at all if not needed

  • ci: fix deqp clone + fetch

  • CI: build our own spirv tools

  • clover/nir: add a constant folding pass before lowering mem const

  • llvmpipe: fix sampler/image binding for clover.

  • gallivm: add load/store scratch support.

  • llvmpipe: fix 8/16 bit global stores

  • gallivm: fix 64->16 f2f16

  • gallivm: add 16-bit split/merge support.

  • gallivm: add b2i8/b2i16 support

  • gallivm: handle sub-32 bit masked stores.

  • gallivm: add support for 8/16-bit mul_hi

  • gallivm: get correct min/max behaviour for kernels.

  • gallivm: lower flrp for all sizes.

  • CI: remove llvmpipe cl flake test

  • gallivm: zero init the temporary register storage.

  • gallium: add a level parameter to resource parameter get

  • gallium: add a layer stride pipe resource parameter.

  • llvmpipe: add resource get param support.

  • lavapipe: use resource get param.

  • gallivm: fix f16 quantize.

  • lavapipe: don’t write to pending clear aspects in cmd buffer

  • lavapipe: constify state pointers into command buffers.

  • lavapipe: fix dEQP-VK.info.device_properties

  • gallivm/nir: handle dvec3/4 inputs properly.

  • gallivm/nir: fix vulkan vertex inputs

  • lavapipe: fix 3d compressed texture copies.

  • lavapipe: stop crashes with 3D z blits

  • llvmpipe: add clear_buffer callback. (v2)

  • lavapipe: use clear_buffer callback

  • lavapipe: don’t advertise linear filtering on integer textures.

  • gallium: add a non-multisample sample mask out behaviour flag.

  • llvmpipe: respect the sample mask in non-multisample flag

  • lavapipe: request correct sample mask behaviour

  • CL: update CL headers to 3.0

  • vtn/opencl: add ctz support

  • clover: access 3.0 and deprecated 2.2 API

  • clover/llvm: add 3.0 versioning.

  • clover/spirv: hook up spir-v environment for 3.0

  • clover: add empty cl 3.0 dispatch entries.

  • gallium: handle empty cbuf slots in framebuffer samples helper

  • u_blitter: port radv 3D blit coords logic.

  • lavapipe: enable alpha to one.

  • lavapipe: disable SNORM blending for now

  • llvmpipe: just use draw_regions in draw/line setup.

  • draw: fix tess eval pipeline statistics.

  • lavapipe: fixup device allocate + enable private data

  • lavapipe: fix wsi acquire fences

  • llvmpipe/setup: move point stats collection earlier.

  • llvmpipe: fix multisample point rendering.

  • llvmpipe: fix multisample lines.

  • lavapipe: fixup mipmap precsion bits

  • lavapipe: enable pipeline stats queries

  • gallium: fix missing bit field in p_state.h

Denis Pauk (1):

  • mesa: bptc fixes for decompress rgba_unorm and rgb_float

Duncan Hopkins (10):

  • meson: Add xcb-fixes to loader when using x11 and dri3. Fixes undefined symbol for xcb_xfixes_create_region in loader_dri3_helper.c

  • zink: clamped maxPerStageDescriptorUniformBuffers limits to INT_MAX when stored as uint32_t.

  • zink: Basic framework to check for optional instance layers and instance extensions.

  • zink: Added support for MacOS MoltenVK APIs.

  • zink: return fail if create_instance fails

  • zink: Added inbuilt debug logging from the VK_LAYER_LUNARG_standard_validation layer.

  • zink: add support to device info for macro guards and just VkPhysicalDevice*Features with out the have_.

  • zink: have_triangle_fans support.

  • zink: For MoltenVk added vkFlushMappedMemoryRanges() to vkMapMemory() to fix empty mapped memory.

  • zink: make physical device functions use a dynamic function pointers.

Dylan Baker (31):

  • Bump development version and clear new_features

  • meson/freedreno: Fix lua requirement

  • docs: update calendar for 20.2.0-rc1

  • docs: update calendar for 20.2.0-rc2

  • meson/anv: Use variable that checks for --build-id

  • glsl/xxd.py: fix imports

  • clover/meson: use dep.get_variable instead of deprecated get_pkgconfig_variable

  • meson: generalize libclc usage

  • docs: update calendar and link releases notes for 20.2

  • docs/release-calender: Add 20.2 stable releases

  • docs: add release notes for 20.2.0

  • docs: Add sha256 sums for 20.2.0

  • docs: add release notes for 20.2.1

  • docs: add SHA256 sums for 20.2.1

  • docs: update calendar and link releases notes for 20.2.1

  • docs: add release notes for 20.2.2

  • docs: Add sha256 sums for 20.2.2

  • docs: update calendar and link releases notes for 20.2.2

  • bump version for 20.3-rc1

  • .pick_status.json: Update to bf5cea7232f9ee2934c212211ebefb6fe766526d

  • .pick_status.json: Update to 87dc3106b077199b829a082e32ec33d0c6d400ab

  • .pick_status.json: Update to bac6cc586fe4c1b24351e0574d3a961eb631f6ae

  • bump VERSION for 20.3.0-rc2 release

  • .pick_status.json: Update to a59b1b18a95af1f8edb0093baf508e974e3251a2

  • .pick_status.json: Update to a92f597b98bb032b904c7c8a8c3a9fe798b51915

  • .pick_status.json: Update to 9fa1cdfe7ffd9e7ebd83055e2008f3e4b8ada549

  • meson: Don’t add extra values to shader-cache

  • appveyor: disable for now

  • bump VERSION for 20.3-rc3

  • .pick_status.json: Update to 89f6b72f19dbc503386643c6283047bdb1013bef

  • .pick_status.json: Update to d3c67d7e7ec6b9cf10fbea0d08e92751b7b0fbae

Eduardo Lima Mitev (9):

  • st: Pass TextureTiling option from texture to memory obj

  • freedreno: Implement memory object create/destroy for GL_EXT_memory_object

  • freedreno: Refactor fd_resource_create_with_modifiers() into a helper

  • freedreno/layout: Move hard-coded minimum width for UBWC to a macro

  • freedreno: implement pipe screen’s resource_from_memobj

  • freedreno: Implement pipe screen’s get_device/driver_uuid()

  • freedreno: Enable GL_EXT_memory_object and GL_EXT_memory_object_fd

  • freedreno: Destroy syncobj too when destroying fence

  • turnip: Enable support for KHR_incremental_present

Eleni Maria Stea (3):

  • radeonsi: support for external buffers (ext_external_objects)

  • iris: handle PIPE_FD_TYPE_SYNCOBJ type

  • iris: add support for fence signal capability

Emil Velikov (1):

  • radv: restrict exported symbols with static llvm

Emmanuel Vadot (1):

  • util/os_misc: os_get_available_system_memory() for FreeBSD

Eric Anholt (221):

  • util: Split the pack/unpack functions out of the format desc.

  • util: Change a codegenned switch statement to a nice little table.

  • util: Fix up indentation in the generated format tables code.

  • util: Add R1_UNORM to the list of noaccess (no pack/unpack) formats.

  • util: Make all 3 fetch_rgba functions occupy the same function slot.

  • util: Mark the format description getter functions as const.

  • util: Move fetch_rgba to a separate function table.

  • gallium: Use unpack_rgba() instead of fetch_rgba in translate_generic

  • freedreno/ir3: Fix compiler warning from the setjmp fails path.

  • freedreno/cffdec: When .mergedregs is set, don’t count half regs.

  • freedreno/ir3: Fix assertion failures dumping CS high full regs.

  • util: Expose rgba unpack/fetch functions as external functions as well.

  • util: Explicitly call the unpack functions from inside bptc pack/unpack.

  • radv: Move nir_opt_shrink_vectors() into the opt loop.

  • nir/opt_undef: Handle a couple more normal store intrinsics.

  • nir: Expand opt_undef to handle undef channels in a store intrinsic.

  • nir: Shrink store intrinsic num_components to the size used by the writemask.

  • ci/deqp-runner: Drop stale comment from deqp-runner.sh.

  • ci/deqp-runner: Drop unused “count” variable

  • ci/deqp-runner: Add a post-deqp-run filter list for known flakes.

  • ci/freedreno: Move our skips lists over to being known-flakes lists.

  • ci/freedreno: List more common flakes reported recently.

  • ci/bare-metal: Use a new serial buffer tool.

  • ci/bare-metal: Convert the main cros-servo boot code to python

  • ci/bare-metal: Retry booting chezas instead of failing when !POWER_GOOD

  • ci/bare-metal: Try rebooting chezas again if they get stuck during tftp.

  • nir: Make the nir_builder *_imm helpers consistently handle bit size.

  • nir: Add nir_[iu]shr_imm and nir_udiv_imm helpers and use them.

  • nir: Add a lowering pass for backends wanting load_ubo with vec4 offsets.

  • freedreno/ir3: Replace our custom vec4 UBO intrinsic with the shared lowering.

  • nir/load_store_vectorizer: Clean up unit test swizzle assertions.

  • freedreno: Drop UNIFORM_BUFFER_OFFSET_ALIGNMENT to 32

  • ci: Mark the rest of compswap as flaky on freedreno.

  • freedreno/a5xx: Don’t set the VARYING flag for fragcoord-only programs.

  • ci: Test the KHR-GL* CTS cases with softpipe.

  • nir/opt_copy_prop_vars: Quiet valgrind warning about overlapping memcpy.

  • nir: Add a helper for general instruction-modifying passes.

  • nir/lower_vec_to_movs: Convert to use nir_shader_instructions_pass().

  • nir/opt_undef: Convert to use nir_shader_instructions_pass().

  • nir/lower_io_to_scalar: Convert to use nir_shader_instructions_pass().

  • nir/nir_lower_wrmasks: Use the nir_lower_instructions_pass() helper.

  • nir/lower_discard_to_demote: Use nir_shader_instructions_pass().

  • drm-shim: Fix unused variable warnings from asserts in release build.

  • panfrost: Fix OOB array access compiler warning.

  • panfrost: Fix remaining release-build warnings.

  • gallium/tests: Fix compiler warning about unused vars in trivial tests.

  • nvc0: Fix compiler warning about unused var that gets asserted.

  • vc4: Fix unused var warnings in release builds from assertions.

  • nv50: Fix uninitialized var warnings from using assert() as unreachable().

  • zink: Fix unused var warnings in release build from assertions.

  • etnaviv: Fix unused var warning in release build from assertions.

  • lima: Fix unused var/function warnings in release build from assertions.

  • lima: Fix uninitialized var warning from using assert() as unreachable().

  • virgl: Fix unused var warnings in release build from assertions.

  • ci: Add a release build with -Werror enabled.

  • nir: Fix printing of individual instructions with io semantics.

  • nir: Look up the shader when printing a single instruction.

  • ci: Make a missing device name correctly bail out of deqp-runner.sh.

  • turnip: Make sure we include the build id.

  • pipe-loader: Use real galliumvl if radeonsi is being linked.

  • ci: Switch to using gold as the linker.

  • nir: Invalidate live SSA def information when making new SSA defs.

  • nir: Switch the indexing of block->live_in/out arrays.

  • ci: Bump vulkan CTS version to 1.2.3.2, and keep the GL CTS around.

  • ci: Use the same VK-GL-CTS tree for GL/GLES as VK.

  • ci: Enable KHR-GL30 CTS testing on freedreno a630.

  • freedreno/a6xx: Add ARB_depth_clamp and separate clamp support.

  • gallivm: Report the unsupported intrinsic instead of just assert(0);

  • gallium/tgsi: Add support for PRIMITIVEID as a system value.

  • gallium/tgsi: Add some missing opcodes to tgsi_ureg.

  • gallium/tgsi: Add a helper for initializing ureg from a shader_info.

  • gallium/ureg: Set the next shader stage from the shader info.

  • nir: Add simplistic lowering for bany_equal/ball_inequal.

  • nir/opt_vectorize: Add a callback for filtering of vectorizing.

  • gallium/tgsi_exec: Add missing DFLR opcode support.

  • gallium/tgsi_exec: Fix up NumOutputs counting

  • ci/bare-metal: Use re.search() instead re.match() for our line matching.

  • ci/bare-metal: Fix detection of “POWER_GOOD not seen in time” fails

  • ci/bare-metal: Include a timestamp in our serial reads.

  • ci/bare-metal: Log why our run restarts when it does.

  • ci/bare-metal: Fix capturing of serial output as job artifacts.

  • ci/bare-metal: Use python for handling fastboot booting and parsing

  • nir/load_store_vectorizer: Use more imm helpers in the tests.

  • nir/load_store_vectorizer: Add unit tests for alignment handling.

  • nir: Update the comment about nir_lower_uniforms_to_ubo()’s multiplier.

  • nir: Add a range_base+range to nir_intrinsic_load_ubo().

  • freedreno/ir3: Use the new NIR UBO ranges in UBO analysis.

  • freedreno/ir3: Apply the max upload limit to initial range setup

  • nir: Use explicit deref information to provide real UBO ranges.

  • iris: Add missing range_base/range to our nir_load_ubos.

  • turnip: Fix a compiler warning in release builds of the query code.

  • freedreno: Make the pack struct have a .qword for wide addresses.

  • turnip: Fix truncation of CS shader iovas to 32 bits.

  • turnip: Fix truncation of iovas to 32 bits in queries.

  • ci/bare-metal: Update the kernel to msm-next-pgtables

  • ci/bare-metal: Allow wget of the kernel/dtb for kernel development.

  • freedreno: Add another new sysmem flake.

  • freedreno/cffdec: Fix up texturator parsing scripts for XML changes.

  • freedreno/cffdec: Add support for texturator’s 2DMS layout setup.

  • freedreno/fdl: Add layout test for the Android CTS’s MSAA mustpass surface.

  • turnip: Add support for a615.

  • turnip/kgsl: Associate fences with submits.

  • mesa: Make the android_stub be a set of non-installed shared libraries.

  • android: Disable trying to read/write to the disk cache.

  • gallium/drm: Deduplicate screen creation for the dynamic (clover) pipe loader.

  • gallium/drm: Refactor the stub screen create functions.

  • gallium/drm: Define the DRM entrypoints in drm_helper.h

  • gallium/drm: Make the pipe loader handle the driconf merging.

  • util/xmlconfig: Add a unit test of the code.

  • virgl: Clean up the driconf definition of GLES_SAMPLES_PASSED_VALUE.

  • driconf: Use nesting macros for defining options.

  • mesa: Promote Intel’s simple logging façade for Android to util/

  • turnip: Replace tu_log*() with mesa_log*()

  • ci/freedreno: Sort the traces in the .yml of expectations

  • ci/freedreno: Add trace tests for glxgears, 0 A.D., and xonotic.

  • nir/lower_clip: Add i/o semantics for load/store intrinsics.

  • intel: Add support for i945g to intel_stub_gpu.

  • freedreno/ir3: Make sure we run the opt loop after lowering UBOs to vec4.

  • nir: Document a bit about how align_mul/offset work.

  • nir: Print the alignment information on casts.

  • nir/nir_lower_uniforms_to_ubo: Set better alignments on our new instructions.

  • nir/gl_nir_lower_buffers: Set up align_mul/offset on UBOs.

  • nir: Make the load_store_vectorizer provide align_mul + align_offset.

  • nir: Drop the high_offset argument to the load_store_vectorizer filter.

  • nir: Make nir_lower_ubo_vec4() handle non-vec4-aligned loads.

  • freedreno/ir3: Enable the i/o vectorizer on UBOs.

  • ci/bare-metal: Move the “POWER_GOOD not seen in time” check to the right time.

  • driconf: Eliminate the DRI_CONF_OPT_BEGIN_B macro.

  • driconf: Fix extra quoting on “Jimenez’”.

  • r200: Reuse DRI_CONF_OPT_F for texture_blend_quality.

  • driconf: Make a DRI_CONF_OPT_S() for string options.

  • util/xmlconfig: Drop silly open-coded strdup.

  • util/xmlconfig: Indent to Mesa style.

  • driconf: Delete disjoint range support.

  • driconf: Use DRI_CONF_OPT_I for remaining int options

  • driconf: Make the driver’s declarations be structs instead of XML.

  • driconf: Stop quoting true/false in boolean option definitions.

  • util/xmlconfig: Drop use of XML_Char in parsing.

  • android: Disable the user XML config parsing.

  • turnip: Don’t expose VK_ANDROID_native_buffer on non-Android.

  • turnip: Use mesa’s normal PRINTFLIKE macro instead of our own.

  • turnip: Mark the vk_errorf helper as bring printflike.

  • turnip: Extend the coverage of TU_DEBUG=startup.

  • turnip: Always enable TU_DEBUG=startup on debug drivers.

  • turnip: Report device loss through _mesa_loge() instead of fprintf.

  • turnip/kgsl: Add strerror decode in BO init failure.

  • driconf: Make sure that the range check on the defaults actually works.

  • driconf: Restore the ability to override driconf with the environment.

  • ci/softpipe: Add another flaky GS test to the skips list.

  • freedreno/ir3: Clean up the UBO upload plan setup.

  • freedreno/ir3: Don’t leave holes the UBO upload plan.

  • turnip/kgsl: Fix last minute breakage of the build.

  • turnip/kgsl: Add support for importing dma-bufs.

  • turnip: Detect Qualcomm gralloc and its UBWC flag on gralloc surfaces.

  • turnip: Add support for GetSwapchainGrallocUsage2ANDROID().

  • meson: Drop adding -Wl,--gc-sections to project c/cpp arguments.

  • glsl/tests: Make the tests skip on Android binary execution failures.

  • symbols-check: Add __cxa_guard_* to the list of approved symbols.

  • ci/android: Switch to using the Android NDK.

  • docs: Document how to replicate a CI build locally.

  • android_stub: Update platform headers to include gralloc1.h.

  • ci/android: Switch build to using platform SDK version 26.

  • util: Import a copy of drm’s libsync.h

  • android: Add pre-4.7 Android kernel compatibility to our libsync header.

  • turnip: Drop a dead error checking path in device init.

  • turnip: Use Mesa’s libsync.h instead of libdrm’s libsync.h.

  • turnip: Don’t link the WSI code if we don’t have a WSI extension.

  • turnip: Only link libdrm in the DRM case, not KGSL.

  • ci: Enable NIR_VALIDATE everywhere.

  • nir: Introduce nir_metadata_instr_index for nir_index_instr() being current.

  • nir: Replace nir_ssa_def->live_index with nir_instr->index.

  • nir: Add a block start/end ip to live instr index metadata.

  • nir: Add a call to get a struct describing SSA liveness per instruction.

  • nir: Add an option to not lower source mods for f64/u64/i64.

  • gallium: Add a nir-to-TGSI pass.

  • softpipe: Fix buffer overflows in SSBO atomics.

  • softpipe: Switch to using NIR as the shader format from mesa/st.

  • meson: Only require libexpat when a part of the build needs it.

  • freedreno: Use Android’s libsync instead of libdrm’s.

  • meson: Don’t try to build GLX by default on Android.

  • meson: Don’t enable libunwind by in ‘auto’ mode on Android.

  • docs: Document how to build and install Android drivers.

  • freedreno/cffdec: Fix format overflow warning.

  • freedreno/tools: Fix compiler warnings about using sz in the error paths.

  • freedreno/fdperf: Silence a compiler warning about current counter.

  • turnip: Handle some error paths in allocating CS space from a command buffer.

  • turnip: Handle the error path for tu/drm’s vkResetFences().

  • turnip: Add error path handling for descriptor pool init.

  • ci: Enable Werror on meson-arm64-build-test.

  • gallium/ntt: Add default compiler options for non-native-NIR drivers.

  • st/mesa: Drop the TGSI paths for PBOs and use nir-to-tgsi if needed.

  • st/mesa: Drop the TGSI paths for drawpixels and use nir-to-tgsi if needed.

  • nir: Only validate in passes that might have changed things.

  • docs: Move the gallium driver documentation to the top level.

  • docs/vmware: Move the vmware driver docs into the drivers section.

  • docs/vc4: Move my old vc4 wiki’s documentation into docs.mesa3d.org.

  • docs/vc4: Add information on the hw documentation available.

  • docs/v3d: Add a little stub of v3d documentation.

  • docs: Drop extra link to old DRI wiki in the “Help” section.

  • docs: Add a link to the linux kernel DRM docs under “Developer Topics”

  • docs: Fix “Hosted by” link and drop duplicate.

  • ci: Add the new timeout-prone softpipe-gl test to the skips list.

  • mesa/st: Fix a use-after-free of the NIR shader stage.

  • st/nir: Fix the st->pbo.use_gs case.

  • st/nir: Drop setting interp mode on system values in builtins.

  • tu: Make sure spirv_to_nir knows we support imageStorageWithoutFormat.

  • turnip: Fix image size for 3D vkGetImageSubresourceLayout.

  • ci/bare-metal: Apply autopep8 to the bare-metal scripts.

  • ci/bare-metal: Reset colors at the end of a line of serial output.

  • ci/deqp: Switch to a new dEQP runner written in Rust.

  • util/set: Fix the _mesa_set_clear function to not leave tombstones.

  • ci: Only install kernel modules for LAVA devices.

  • gallium/draw: Fix rasterizer_discard for wide points/lines.

  • freedreno: Fix leak of shader binary on disk cache hits.

  • freedreno: Fix warning about uninit size for the size==0 special case.

  • gallium: Fix leak of the merged driconf options.

  • freedreno: Fix leak of u_transfer_helper.

  • gallium: Fix leak of bound SSBOs at CSO context destruction.

  • gallivm: Fix max const buffer count.

  • gallium: Fix leak of currently bound UBOs at CSO context destruction.

  • freedreno: Break out of “should we free the entry” loop once we’ve freed.

Eric Engestrom (94):

  • pick-ui: specify git commands in “resolve cherry pick” message

  • egl/entrypoint-check: split sort-check into a function

  • egl/entrypoint-check: add check that GLVND and plain EGL have the same entrypoints

  • driconf: fix force_gl_vendor description

  • meson: bump required glvnd version

  • egl: replace _EGLDriver param with _EGLDisplay->Driver in _eglReleaseDisplayResources()

  • egl: replace _EGLDriver param with _EGLDisplay->Driver in dri{2_x11,3}_create_window_surface()

  • egl: replace _EGLDriver with _EGLDisplay->Driver in _eglQuerySurface()

  • egl: drop unused _EGLDriver from Initialize()

  • egl: drop unused _EGLDriver from Terminate()

  • egl: drop unused _EGLDriver from {Create,Destroy}Context()

  • egl: drop unused _EGLDriver from Create{Window,Pixmap,Pbuffer}Surface() & DestroySurface()

  • egl: drop unused _EGLDriver from MakeCurrent()

  • egl: drop unused _EGLDriver from QuerySurface()

  • egl: drop unused _EGLDriver from {Bind,Release}TexImage()

  • egl: drop unused _EGLDriver from SwapInterval()

  • egl: drop unused _EGLDriver from SwapBuffers{,WithDamageEXT,RegionNOK}()

  • egl: drop unused _EGLDriver from CopyBuffers()

  • egl: drop unused _EGLDriver from SetDamageRegion()

  • egl: drop unused _EGLDriver from WaitClient()

  • egl: drop unused _EGLDriver & _EGLDisplay from WaitNative()

  • egl: drop unused _EGLDriver from GetProcAddress()

  • egl: drop unused _EGLDriver from {Create,Destroy}ImageKHR()

  • egl: drop unused _EGLDriver from {Create,Destroy,ClientWait,Wait,Signal}SyncKHR()

  • egl: drop unused _EGLDriver from DupNativeFenceFDANDROID()

  • egl: drop unused _EGLDriver from {Create,Export}DRMImageMESA()

  • egl: drop unused _EGLDriver from {Bind,Unbind,Query}WaylandDisplayWL()

  • egl: drop unused _EGLDriver from CreateWaylandBufferFromImageWL()

  • egl: drop unused _EGLDriver from PostSubBufferNV()

  • egl: drop unused _EGLDriver from QueryBufferAge()

  • egl: drop unused _EGLDriver from ExportDMABUFImage{,Query}MESA()

  • egl: drop unused _EGLDriver from QueryDmaBuf{Formats,Modifiers}EXT()

  • egl: drop unused _EGLDriver from SetBlobCacheFuncsANDROID()

  • egl: drop unused _EGLDriver from _eglGetConfigs()/_eglChooseConfig()/_eglGetConfigAttrib()

  • egl: drop unused _EGLDisplay from _eglSetDamageRegionKHRClampRects()

  • egl: drop unused _EGLDriver & _EGLDisplay from _eglQueryContext()

  • egl: drop unused _EGLDriver from _eglSurfaceAttrib()

  • egl: replace _EGLDriver with _EGLDisplay->Driver in _eglGetSyncAttrib()

  • egl: replace replace _EGLDriver with _EGLDisplay->Driver in eglapi.c

  • egl: drop unused _EGLDriver from MesaGLInteropEGL{QueryDeviceInfo,ExportObject}()

  • egl: replace `&_eglDriver`/`NULL` tested against `NULL` with simple `true`/`false`

  • egl: drop unused ${drv}_driver()

  • egl: inline _eglGetDriverProc() into eglGetProcAddress()

  • egl: inline _eglInitializeDisplay() into eglInitialize()

  • egl: drop now empty egldriver.c

  • egl: drop unused egldriver.h header

  • meson: fix trivial s/dir/dri/ typo

  • egl/x11_dri3: enable & require xfixes 2.0

  • egl/x11_dri3: implement EGL_KHR_swap_buffers_with_damage

  • docs: add release notes for 20.1.6

  • docs: update calendar and link releases notes for 20.1.6

  • gitlab-ci: fix testing whether a variable with a given name is set or not

  • gitlab-ci: fix quoting of variables passed down to bare-metal runners

  • egl: drop an indentation level in _eglFindDisplay() by replacing break/if with a goto

  • egl: drop another indentation level in _eglFindDisplay() by inverting an if

  • egl: drop invalid shebang

  • scons: bump c++ standard to 14 to match meson

  • docs/egl: fix typo

  • docs/egl: move section around

  • docs/egl: complete list of dri2 platforms

  • docs/egl: add haiku driver

  • docs/egl: add some more documentation

  • docs/egl: correct/update DRI2 mention with the shiny new DRI3

  • egl: move extension driver functions after core functions

  • egl: document which driver hooks are only required by extensions

  • egl: inline eglSwapInterval() fallback

  • egl: simplify eglSwapInterval() fallback logic

  • meson: don’t advertise TLS support if glx wasn’t build with it

  • egl/android: simplify dri2_initialize_android()

  • egl/surfaceless: simplify dri2_initialize_surfaceless()

  • egl/wayland: simplify dri2_initialize_wayland()

  • egl/x11: simplify dri2_initialize_x11()

  • docs: add release notes for 20.1.7

  • docs: update calendar and link releases notes for 20.1.7

  • docs: shift 20.2 rc dates by two weeks to match reality

  • meson: drop leftover PTHREAD_SETAFFINITY_IN_NP_HEADER

  • docs/download: mention tarball GPG signatures and link to the keys

  • docs: add another 20.1.x release to allow for more overlap with 20.2

  • docs/release-calendar: update 20.2

  • docs: add release notes for 20.1.8

  • docs: update calendar and link releases notes for 20.1.8

  • bin/gen_release_notes.py: escape special rST characters

  • docs: add release notes for 20.1.9

  • docs: update calendar and link releases notes for 20.1.9

  • add one last 20.1 release to coincide with expected 20.2.1

  • radv: add missing u_atomic.h include

  • docs: fix relnotes index

  • docs: fix release calendar

  • docs: fix 20.2.0 relnotes

  • docs: add release notes for 20.1.10

  • docs: update calendar and link releases notes for 20.1.10

  • docs/release-calendar: plan 20.3 release

  • gitlab-ci: drop deprecated platforms that snuck in when nobody was watching

  • meson: drop deprecated EGL platform build options

Erico Nunes (4):

  • lima: dont split vec3 unaligned load inputs

  • lima: allocate new bo for stream draw

  • lima: fix vertex shader uniform buffer size

  • lima: add natively supported vertex buffer formats

Erik Faye-Lund (123):

  • st/wgl: do not reject PFD_SUPPORT_GDI

  • gallium/util: factor out primitive-restart rewriting logic

  • gallium/indices: don’t expand prim-type for 8-bit indices

  • gallium/indices: generalize primitive-restart logic

  • gallium/indices: implement prim-restart for line-loops

  • gallium/indices: use prim_restart-helper for polygon

  • gallium/indices: implement prim-restart for triangle fans

  • gallium/indices: introduce u_primconvert_config

  • gallium/indices: translate primitive-restart values

  • compiler/nir: make lowering global-id to local-id optional

  • nir: add iabs-lowering code

  • gallium/util: use uint sampler for stencil-reads

  • nir: fix const-cast warning on MSVC

  • v3d: remove unused header

  • vc4: remove unused header

  • gallium/aux: remove unused u_blit.[ch]

  • gallium/util: add shader for stencil-blits

  • gallium/util: add blitter-support for stencil-fallback

  • mesa: handle GL_FRONT after translating to it

  • zink: correct typo in stencil-setup

  • zink: store base-object of DSA-state

  • zink: only set stencil-ref for back if two-sided

  • docs: escape backquote character

  • docs: show ‘Edit on GitLab’-link

  • docs: store prefixes in redirects

  • docs: remove webmaster article

  • docs: everytime -> every time

  • docs: apis -> APIs

  • docs: scons -> SCons

  • docs: frambuffer -> framebuffer

  • docs: make two acronyms upper-case

  • docs: unecessarily -> unnecessarily

  • docs: behaviour -> behavior

  • docs: timeplan -> time plan

  • docs: initialisation -> initialization

  • docs: gitlab -> GitLab

  • docs: url -> URL

  • docs: recognisable -> recognizable

  • docs: drop outdated gallium-docs comment

  • docs: clippping -> clipping

  • docs: consistantly -> consistently

  • docs: stabilisation -> stabilization

  • docs: flavours -> flavors

  • docs: debian -> Debian

  • docs: docker -> Docker

  • docs: gallium -> Gallium

  • st/mesa: use roundf instead of floorf for lod-bias rounding

  • gallium/util: set right dst-dimensions

  • gallium/util: fix texture-coordinates for stencil-fallback

  • gallium/util: allow scaling blits for stencil-fallback

  • docs: softwara -> software

  • docs: existant -> existent

  • docs: webservice -> web service

  • docs: bpp -> BPP

  • docs: llvm -> LLVM

  • docs: correct reference to meson.build

  • docs: meson -> Meson

  • docs: python3 -> Python 3

  • docs: flex -> Flex

  • docs: bison -> Bison

  • docs: mako -> Mako

  • docs: chocolatey -> Chocolatey

  • docs: ninja -> Ninja

  • docs: mingw -> MinGW

  • docs: microsoft -> Microsoft

  • docs: linux -> Linux

  • docs: windows -> Windows

  • docs: visual studio -> Visual Studio

  • docs: gpu -> GPU

  • docs: cmake -> CMake

  • docs: x11 -> X11

  • docs: wayland -> Wayland

  • docs: drm -> DRM

  • docs: android -> Android

  • docs: git -> Git

  • docs: quote “git log”

  • docs: scons -> SCons

  • docs: ubuntu -> Ubuntu

  • docs: vmware -> VMWare

  • docs: Sandybridge -> Sandy Bridge

  • docs: cpu -> CPU

  • gallium/util: fix memory-leak

  • gallium/util: allow scissored blits for stencil-fallback

  • zink: use nir_lower_ubo_vec4 to simplify things a bit

  • zink: support non-const offsets

  • zink: support loading any UBO

  • zink: do not report SSBOs as halfway supported

  • zink: add feature-documentation

  • zink: reject resource-imports with modifiers

  • v3d: do not report alpha-test as supported

  • vc4: do not report alpha-test as supported

  • nir: drop support for using load_alpha_ref_float

  • nir: drop unused alpha_ref_float

  • docs: create leading directories for redirects

  • docs: verify that targets for relative redirects exist

  • docs: specify redirects relative to docs-root

  • docs: specify redirects in conf.py instead

  • zink: verify geometry shader feature

  • docs: do not document required minimum

  • docs: document zink’s gl > 3.0 requirements

  • mapi: remove unused function

  • mapi: do not call thread-unsafe dispatch getter

  • mapi: do not return thread-specific data for wrong thread

  • docs: add link to extension spec

  • docs: ie. -> i.e.

  • docs: eg. -> e.g.

  • docs: anistropy -> anisotropy

  • docs: api -> API

  • docs: hud -> HUD

  • docs: fbo -> FBO

  • docs: gcc -> GCC

  • docs: clang -> Clang

  • docs: s3tc -> S3TC

  • spirv: correct sematic-typo

  • libgl-gdi: support building without softpipe

  • gallium/util: do not pass undefined sample-count

  • softpipe: correct signature of get_compiler_options

  • mesa/main: add missing include in glformats.h

  • zink: more accurately track supported blits

  • zink: fix layered resolves

  • zink: fall back to util_blitter for scaled resolves

  • docs: document new zink-flag

  • zink: do not require VK_KHR_external_memory

Felix Yan (1):

  • Correct a typo in threads_win32.h

Gert Wollny (81):

  • gallium + mesa/st: Add PIPE_CAP_NIR_ATOMICS_AS_DEREF and use it

  • r600: Set PIPE_CAP_NIR_ATOMICS_AS_DEREF to true

  • r600/sfn: Sort uniforms by binding and offset

  • r600/sfn: add r600 specific lowering pass for atomics and use it

  • r600/sfn: Add a mapping table for atomics

  • r600/sfn: correct allocating and emitting of atomics

  • r600/sfn: Correct ssbo instruction handling

  • r600/sfn: handle querying SSBO size

  • r600/sfn: Force a minimum of 4 GPRs, it seems to fix atomics

  • r600: Enable compute shaders for NIR code path

  • compiler/nir: rewrite lower_fragcoord_wtrans to use nir_lower_instructions

  • compiler/nir: extend lower_fragcoord_wtrans to support VARYING_SLOT_POS

  • gallium/aux: reorder vertex attributes in triangle fans according to PV

  • meson: Make some warnings handled as errors with MSVC

  • r600: revert disabling llvm draw

  • r600/nir: fetch sources and split uniforms before emittting alu instructions

  • r600/sfn: correct ring op patching

  • r600/sfn: Fix loading vertex attributes

  • r600/sfn: clone shader before lowering to registers and src/dest modifiers

  • r600/sfn: Fix ordering of tex param moves

  • r600/sfn: avoid some copies

  • r600/sfn: Lower *sign opcodes in nir

  • r600/sfn: Fix split_alu_modifiers

  • r600/sfn: Fix bitfield ops and 2x16 split_y

  • r600/sfn: Fix source swizzle for gradient queries

  • r600/sfn: more fixing of vec4 fetching

  • r600/sfn: Fix comparison with different signedness

  • nir: Add option lower_uniforms_to_ubo

  • radeonsi: set compiler flag lower_uniforms_to_ubo

  • freedreno/ir3: set lower_uniforms_to_ubo compiler flag

  • intel/compiler: Set lower_uniform_to_ubo compiler flag

  • llvmpipe: set lower_uniform_to_ubo compiler flag

  • gallium+mesa/st: lower uniforms based on compiler flag instead of packed uniforms cap

  • r600: enable lowering uniforms to UBO

  • r600/sfn: Use load_ubo_vec4 lowering pass

  • nir: remove ubo_r600 instrinsic since ubo_vec4 is used now

  • r600/sfn: make number of source components a local variable

  • r600/sfn: Fix component count for fdph

  • r600/sfn: Fix typo in comment

  • r600/sfn: use cnde instead of cnde_int

  • r600/sfn: run late algebraic optimizations

  • r600/sfn: remove a useless if-condition

  • r600: Add flag for dual-source blending to shader key

  • r600/sfn: Sort the outputs of the FS according to data index

  • r600/sfn: Keep info about dual-source blend in FS

  • r600/sfn: Handle number of color outputs taking dual source blending into account

  • r600/sfn: Take dual source blending output indices into account

  • r600/sfn: Acquire the number of FS outputs and the write_all info early

  • r600/sfn: Be a bit more verbose when logging skipped FS outputs

  • r600/sfn: Fix emitting shared atomics with constant sources

  • r600/sfn: Handle nir_op_b2b32

  • r600/sfn: lower to scalar for some optimizations and vectorize later

  • r600/sfn: Support group memory barrier

  • r600/sfn: save some instructions when doing multisample on sample 0

  • r600/sfn: use fine gradient evaluation for interpolate_at_offset

  • r600/sfn: Fix interpolate at sample

  • r600/sfn: Fix indirect const buffer access

  • r600/sfn: go back to not lowering uniforms to UBOs

  • r600/sfn: replace hand-backed literal check by NIR function

  • r600/sfn: remove old code to track uniforms as it is no longer needed

  • r600/sfn: Add support for helper invocations

  • r600/sfn: Fix enabling the right interpolator for interpolate_at_sample

  • r600/sfn: Fix IDX register ID

  • r600/sfn: Add support for more barrier instructions

  • r600/sfn: extend life range of all variables by one

  • r600/sfn: Don’t reuse registers for workgroup ID and local invocation ID

  • r600/sfn: Fix ssbo resource offset for buffer loads

  • r600/sfn: Fix keepalive patch

  • r600/sfn: fix mega fetch count for SSBO/Image atomics result fetch

  • r600/sfn: Rework get_temp_register to return a smart pointer to GPRValue

  • r600/sfn: use shared pointer to GPR for FS sysvalues

  • r600/sfn: Handle mem barrier and image barrier by using ACK

  • r600/sfn: use cacheless op for coherent image write

  • r600/sfn: use 32 bit bools

  • r600/sfn: fix remapping of deleted attributes

  • r600/sfn: Use register keep-alive also when scanning the shader

  • r600/sfn: Fix the parameter component type

  • r600/sfn: Update state docu

  • compile/nir: Correct printing dest_type

  • r600/sfn: lower bool to int32 only after common optimizations

  • r600/sfn: fix component loading from fixed buffer ID

Greg V (1):

  • radv,anv: use CLOCK_MONOTONIC_FAST when CLOCK_MONOTONIC_RAW is undefined

Guido Günther (1):

  • kmsro: Extend to include imx-dcss

Gurchetan Singh (7):

  • virgl: add flags to (*resource_create) callback

  • drm-uapi: virtgpu_drm.h: resource create blob + host visible memory region

  • virgl/drm: query for resource blob and host visible memory region

  • virgl/drm: add resource create blob function

  • virgl: support PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT

  • virgl: query blob mem

  • virgl: fix stride + layer_stride inconsistency

Hoe Hao Cheng (7):

  • zink: generate extension infrastructure using a python script

  • zink: hook zink_device_info.py to build system

  • zink: use the new extension infrastructure in device creation

  • zink: use the new, generated extension infrastructure

  • zink: remove old extension infrastructure

  • zink: implement pipe_device_reset_callback

  • zink: call the reset callback not only during a status check

Hyunjun Ko (4):

  • freedreno: support GL_EXT_semaphore

  • turnip: Refactor structs of tu_query

  • turnip: Support pipeline statistics query

  • turnip: Implement VK_EXT_host_query_reset

Iago Toral Quiroga (443):

  • v3d/compiler: fix V3D double-rounding of .8 fixed-point XY coordinates

  • v3dv: add support for valgrind macros

  • v3dv: implement vkCreateInstance

  • v3dv: implement vkDestroyInstance

  • v3dv: implement vkEnumeratePhysicalDevices

  • v3dv: pretend to initialize a physical device

  • v3dv: Implement vkGetPhysicalDeviceProperties

  • v3dv: retrieve device name from device info

  • v3dv: add a comment to clarify how we should implement uuid / deviceID retrieval

  • v3dv: implement vkGetPhysicalDeviceMemoryProperties

  • v3dv: implement vkGetPhysicalDeviceFeatures

  • v3dv: implement vkEnumerateDeviceExtensionProperties

  • v3dv: amend vkEnumerateInstanceExtensionProperties to handle layers

  • v3dv: implement vkGetPhysicalDeviceQueueFamilyProperties

  • v3dv: implement vkCreateDevice

  • v3dv: implement vkGetDeviceQueue

  • v3dv: add dummy implementations for the packet definition generator

  • v3dv: add stubs for the format table and vkGetPhysicalDeviceFormatProperties

  • v3dv: add some basic support for format properties

  • v3dv: implement vkEnumerate{Instance,Device}LayerProperties

  • v3dv: add stub for vkDeviceWaitIdle

  • v3dv: implement vkCreateImage

  • v3dv: implement vkGetImageMemoryRequirements

  • v3dv: initialize memory heaps in the physical device

  • v3dv: implement vkAllocateMemory

  • v3dv: implement vkFreeMemory

  • v3dv: implement vkMapMemory

  • v3dv: implement vkUnmapMemory

  • v3dv: implement vkBindImageMemory

  • v3dv: implement vkCreateImageView

  • v3dv: implement vk{Create,Destroy}Buffer

  • v3dv: implement vkGetBufferMemoryRequirements

  • v3dv: implement vkBindBufferMemory

  • v3dv: implement vkCreateRenderPass

  • v3dv: implement vk{Create,Destroy}RenderPass

  • v3dv: implement vk{Create,Destroy}Framebuffer

  • v3dv: implement vkCreateCommandPool

  • v3dv: implement vk{Allocate,Free}CommandBuffers

  • v3dv: create a v3dv_bo struct and reference it from v3dv_device_memory

  • v3dv: add a concept of a command list

  • v3dv: implement vkBeginCommandBuffer

  • v3dv: start handling command buffer status

  • v3dv: implement vkGetPhysicalDeviceImageFormatProperties

  • v3dv: make v3dv_bo_alloc allocate memory for the bo struct

  • v3dv: compute tile size for framebuffer

  • v3dv: implement vkCmdBeginRenderPass

  • v3dv: make the command buffer own the command list BOs

  • v3dv: add a few more API stubs

  • v3dv: store base mip level in the image view

  • v3dv: add the tile state and alloc BOs to the command buffer BO list

  • v3dv: revert the decision that the command buffer takes ownership of BOs

  • v3dv: implement vkDestroyImage and vkDestroyImageView

  • v3dv: make v3dv_layer_offset public

  • v3dv: plug leak when destroying device

  • v3dv: precompute more tiling info at framebuffer creation time

  • v3dv: emit scissor to render area and precompute hw color clear values

  • v3dv: emit the render command list

  • v3dv: implement vkEndCommandBuffer

  • v3dv: create the command buffer BO set before we init CLs

  • v3dv: keep track of the number of BOs in a command buffer

  • v3dv: clear set of BOs in the command buffer on reset

  • v3dv: implement vkQueueSubmit

  • v3dv: be more conservative resetting command buffer state

  • v3dv: setup color clear values at subpass setup time

  • v3dv: emit tile loads

  • v3dv: flush at the end of each subpass

  • v3dv: split framebuffer internal bpp calculations from tiling calculations

  • v3dv: rename and make compute_tile_size_for_framebuffer() public

  • v3dv: implement vkCmdCopyImageToBuffer

  • v3dv: add the concept of a job

  • v3dv: implement vkCmdNextSubpass

  • v3dv: use the correct miplevel slice for the tile load operation

  • v3dv: implement vkCmdPipelineBarrier

  • v3dv: do not automatically emit a binner flush when finishing jobs

  • v3dv: fix clipping against render area

  • v3dv: add a note on interactions between clearing and scissor

  • v3dv: rewrite attachment state tracking

  • v3dv: only clear attachments on the first subpass that uses them

  • v3dv: merge subpasses into the same job when possible

  • v3dv: fix tile buffer loading

  • v3dv: rewrite the attachment needs clearing condition

  • v3dv: create a helper to start a new frame

  • v3dv/cmd_buffer: rename render pass RCL emission helpers to be more explicit

  • v3dv: handle VK_ATTACHMENT_UNUSED properly in more places

  • v3dv: implement vkDeviceWaitIdle

  • v3dv: implement vk{Create,Destroy}Semaphore

  • v3dv: implement semaphore waits and signals on queue submissions

  • v3dv: implement fences

  • v3dv: support queue submissions with multiple command buffers

  • v3dv: implement vkGetPhysicalDeviceSparseImageFormatProperties

  • v3dv: include Vulkan version 1.1 as unsupported.

  • v3dv: implement VK_KHR_get_physical_device_properties2

  • v3dv: implement VK_KHR_external_memory_capabilities

  • v3dv: implement VK_KHR_external_memory{_fd,_dma_buf}

  • v3dv: fix copy image to buffer

  • v3dv: implement vkGetImageSubresourceLayout

  • v3dv: implement DRM modifier setup for WSI

  • v3dv: hook up WSI support

  • v3dv: implement device detection on actual hardware

  • v3dv: allocate winsys BOs properly

  • v3dv: rename drm device fields so they are more explicit

  • v3dv: don’t swap RB channels when copying images to buffers

  • v3dv: implement support for depth testing

  • v3dv: don’t always skip tile buffer stores

  • v3dv: compute subpass ranges for attachments at render pass creation time

  • v3dv: select the depth/stencil buffer from the attachment aspect mask

  • v3dv: select correct internal type for depth/stencil formats

  • v3dv: support depth testing on combined depth/stencil formats

  • v3dv: implement stencil testing

  • v3dv: fix indentation

  • v3dv: support copying depth/stencil aspects to buffer

  • v3dv: fix viewport state from pipeline

  • v3dv: implement early Z optimization

  • v3dv: clamp stencil masks and reference value to supported limits

  • v3dv: implement dynamic stencil states

  • v3dv: fix the mess with dynamic state handling

  • v3dv: add a helper to compute the hardware clear color

  • v3dv: add a helper to get the Z/S buffer from an aspect mask

  • v3dv: implement vkCmdClearAttachments

  • v3dv: implement indexed draws

  • v3dv: fix clockwise primitive setting

  • v3dv: ignore image view aspects for depth/stencil attachments

  • v3dv: take the number of layers from the framebuffer

  • v3dv: Add more supported formats to our format table

  • v3dv: don’t advertise texel buffer support yet.

  • v3dv: implement vkCmdCopyBuffer

  • v3dv: implement vkCmdUpdateBuffer

  • v3dv: implement vkCmdFillBuffer

  • v3dv: move the framebuffer setup code for buffer copy/fill to a helper

  • v3dv: add a concept of a fake framebuffer for meta-copy operations

  • v3dv: refactor common code in meta copy operations

  • v3dv: fix copy size for image to buffer copies

  • v3dv: implement vkCmdCopyImage

  • v3dv: implement vkCmdClearColorImage

  • v3dv: fix buffer automatic stride for image to buffer copies

  • v3dv: implement vkCmdClearDepthStencilImage

  • v3dv: implement vkCmdCopyBufferToImage for color formats

  • v3dv: vkCmdCopyBufferToImage for depth/stencil formats

  • v3dv: add an assert to catch applications trying to clear invalid aspects

  • v3dv: implement indirect draws

  • v3dv: add support for primitive restarts on indexed draw calls

  • v3dv: initialize in_sync_bcl in our submits

  • v3dv: implement vkResetCommandBuffer

  • v3dv: add assertions for unimplemented fallback paths

  • v3dv: honor swizzle for non-copy operations of color formats

  • v3dv: implement vkQueueWaitIdle

  • v3dv: destroy wsi device during physical device termination

  • v3dv: implement vk{Create,Destroy}BufferView

  • v3dv: implement host-side event handling functions

  • v3dv: adjust a few limits to comply with CTS minimum requirements

  • v3dv: declare that we support robust buffer access

  • v3dv: meet requirements for supported format features and properties

  • v3dv: implement vkResetCommandPool

  • v3dv: don’t swap R/B channels for VK_FORMAT_R5B6G5_UNORM_PACK16

  • v3dv: don’t use TLB path for formats that are not supported for rendering

  • v3dv: fix image clearing with VK_REMAINING_*

  • v3dv: don’t support image formats that we can render to or texture from

  • v3dv: fix fill buffer with VK_WHOLE_SIZE

  • v3dv: implement vkGetRenderAreaGranularity

  • v3dv: fix supertile coverage when render area size is 0.

  • v3dv: take memory format from appropriate miplevel for image load/store

  • v3dv: fix framebuffer format when computing fragment shader key

  • v3dv: fix subpass tracking in the command buffer state

  • v3dv: rewrite frame tiling setup

  • v3dv: more frame tiling refactors

  • v3dv: trivial refactors in a few meta copy helpers

  • v3dv: assign driver locations on fragment shader output variables

  • v3dv: don’t reset loader data on command buffers

  • v3dv: drop incorrect assertion

  • v3dv: add a no-op fragment shader if we don’t have one

  • v3dv: implement interpolation qualifiers

  • v3d/compiler: implement nir_op_fquantize2f16

  • v3dv: call nir_lower_io_arrays_to_elements_no_indirects on vertex shaders

  • v3dv: fix incorrect sizing of the vertex attribute state array

  • v3dv: split fragment shader array outputs

  • v3dv: lower usubborrow and uaddcarry

  • v3dv: lower {i,u}mulExtended

  • v3dv: don’t assume that VkPipelineColorBlendStateCreateInfo is provided

  • v3dv: drop incorrect assertion

  • v3dv: drop assert for map of a mapped buffer

  • v3dv: fix image tiling configuration

  • v3dv: fix scissor outside viewport

  • v3dv: fix viewport Z

  • v3dv: work around viewport Z scale hardware bug

  • v3dv: don’t leak job allocations

  • v3dv: handle the case where we fail to allocate a new job gracefully

  • v3dv: only export the last job sync object once

  • v3dv: support submits without a command buffer

  • v3dv: return OOM error if we fail to import or export sync objects

  • v3dv: use vk_error() for all queue/submit errors

  • v3dv: fix copies and clears of 3D images

  • v3dv: fix depth/stencil clear color

  • v3dv: implement color blending

  • v3dv: only expose blending on formats that support it

  • v3dv: add an ‘always flush’ mode

  • v3dv: always flush draw calls if we are doing sRGB blending

  • v3dv: implement dynamic state for blend constants

  • v3dv: only emit blend state if the pipeline is dirty

  • v3dv: rewrite dirty state handling

  • v3dv: drop redundant emission of stencil state

  • v3dv: stencil state fixes

  • v3dv: only emit config bits and varyings packets if needed

  • v3dv: use perp end caps rasterization mode for lines

  • v3dv: drop incorrect assertion on number of clear values at render pass begin

  • v3dv: disable depth/stencil testing if we don’t have a depth/stencil attachment

  • v3dv: assert on vkCreateComputePipelines

  • v3dv: improve assert handling for fallback paths on meta copy/clear operations

  • v3dv: check support for transfer usage flags

  • v3dv: make sure we only expose transfer features for formats we can use

  • v3dv: use compatible TLB formats if possible during copies and clears

  • v3dv: fix incorrect image slice selection

  • v3dv: fix clearing of 3D images

  • v3dv: fix job subpass index for vkCmdClearAttachments jobs

  • v3dv: don’t emit the subpass RCL for jobs that have emitted their own

  • v3dv: fix a1r5g5b5 format

  • v3dv: allow to create shader modules from NIR

  • v3dv: improve asserts for VkPipelineColorBlendStateCreateInfo handling

  • v3dv: implement partial color attachment clears

  • v3dv: implement partial depth/stencil attachment clears

  • v3dv: implement proper caching for partial clear pipelines

  • v3dv: store the clip window in the command buffer state

  • v3dv: check the render area against the clip window

  • v3dv: fix v3dv_GetRenderAreaGranularity to account for attachment bpp

  • v3dv: don’t always assert that we have an active job

  • v3dv: use the TLB to clear attachments even if we have an active scissor

  • v3dv: restrict render pass clears to the render area

  • v3dv: handle stencil load/store operations

  • v3dv: assert on subpasses that use input or resolve attachments

  • v3dv: push/pop more state during meta operations

  • v3dv: create a v3dv_cmd_buffer_subpass_resume helper

  • v3dv: set render area for partial clears to match clear rect

  • v3dv: compute tile granularity for each subpass

  • v3dv: fix incorrect attachment reference

  • v3dv: fix incorrect attachment reference

  • v3dv: simplify partial clearing code

  • v3dv: handle partial clears of just one aspect of combined DS targets

  • v3d/compiler: implement nir_intrinsic_load_base_instance

  • v3dv: emit instanced draw calls when requested

  • v3dv: fix subpass merge tests

  • v3dv: reset all state to dirty when we start a new job for a command buffer

  • v3dv: implement occlusion queries

  • v3dv: submit a no-op job if a command buffer doesn’t have any jobs.

  • v3dv: simplify handling of no-op jobs

  • v3dv: add a bunch of API stubs

  • v3dv: implement TFU blits

  • v3dv: reset subpass index at render pass end

  • v3dv: meta operations can happen outside a render pass

  • v3dv: save and restore descriptor state during meta operations if needed

  • v3dv: save and restore push constant state during meta operations

  • v3dv: implement shader draw fallback for vkCmdBlitImage

  • v3dv: require optimal tiling for features that require sampling

  • v3dv: move early-Z update to pre-draw

  • v3dv: don’t leak NIR code in pipelines

  • v3dv: don’t leak host memory allocated for shader variants

  • v3dv: don’t leak default pipeline attributes BO

  • v3dv: don’t leak prog_data from shader variants

  • v3dv: don’t leak the compiler from the physical device

  • v3dv: don’t leak the texture shader state BO from image views

  • v3dv: don’t leak state BO from samplers

  • v3dv/blit: fix integer blits from larger to lower bit size

  • v3dv: handle miplevel correctly for blits

  • v3dv: support depth blits

  • v3dv: don’t support blitting of combined depth/stencil formats

  • v3dv: don’t support 1D depth/stencil for transfer sources or sampling

  • v3dv: remove incorrect assert

  • v3dv: support blits with 1D and 3D images

  • v3dv: add framework for private driver objects

  • v3dv: fix leaks during recording of meta blits

  • v3dv: use the private object framework in the meta clear path

  • v3dv: implement fallback for partial image copies

  • v3dv: implement stencil aspect blits for combined depth/stencil format

  • v3d: fix Tile Rendering Mode Cfg (Color) packet description

  • v3dv: limit software integer RT clamp to rgb10a2

  • v3dv: handle copies from/to compressed formats

  • v3dv: implement partial buffer copies to color images

  • v3dv: support blitting both depth and stencil aspects at the same time

  • v3dv: implement partial buffer copies to depth/stencil images

  • v3dv: always return true from a fallback path if it can handle the case

  • v3dv: fix image addressing calculations to account for suballocation

  • v3dv: only require 4-byte alignment for linear images

  • v3dv: implement partial image to buffer copies

  • v3dv: do not rewrite blit spec for combined depth/stencil in get_blit_pipeline

  • v3dv: drop blit path for depth/stencil formats

  • v3dv: implement depth bias

  • v3dv: ignore dynamic updates of depth bounds state

  • v3dv: implement wide lines

  • v3dv: fix dynamic blend constants

  • v3dv: fix the command buffer private object framework for 32-bit

  • v3dv: fix depth/stencil clears on hardware

  • v3dv: make the driver more robust against OOM

  • v3dv: implement events

  • v3dv: don’t leak BOs from CLs when using BRANCH

  • v3dv: fix vkResetCommandPool

  • v3dv: make TLB clearing paths return true/false

  • v3dv: drop the extra BO handling from the command buffer

  • v3dv: remove some unnecessary / unused functions

  • v3dv: assert command buffers are executable when submitting to a queue

  • v3dv: check that GPU device matches requirements

  • v3dv: ensure BCL space is available before emitting packets

  • v3dv: handle OOM properly during command buffer recording in more places

  • v3dv: fix bogus command buffer allocation scopes

  • v3dv: add basic support for secondary command buffers

  • v3dv: implement vkCmdWaitEvents for secondary command buffers

  • v3dv: support vkCmdClearAttachments in secondary command buffers

  • v3dv: don’t leak attachment state

  • v3dv: add stubs for missing API implementations

  • v3dv: warn users that this is not a conformant driver

  • v3dv: fix BCL start offset in presence of chained BOs

  • v3dv: regen BO lists for CLs inside cloned jobs

  • v3dv: fix a few cases where we were ignoring suballocated buffers

  • v3dv: fix release build warnings

  • v3dv: actually enable early Z

  • v3dv: try harder to skip emission of redundant state

  • v3dv: add a TFU path for buffer to image copies

  • v3dv: add a CPU path for buffer to image copies

  • v3dv: try to use TFU path when creating tiled images from linear buffers

  • v3dv: always map full BOs

  • v3dv: support compute pipelines

  • v3dv: handle separate binding points for compute and graphics

  • v3dv: implement compute dispatch

  • v3dv: handle unsized arrays in SSBOs

  • v3dv: always emit index buffer state for new jobs

  • v3dv: implement indirect compute dispatch

  • v3dv: return a proper error for too large buffer allocations

  • v3dv: assert that our framebuffers are single sampled

  • v3dv: don’t free BOs from imported memory objects

  • v3dv: pipeline initialization fixes for disabled rasterization

  • v3dv: handle empty set layouts

  • v3dv: don’t reset descriptor state after a meta operation

  • v3dv: lower unpack_{u,s}norm_2x16

  • v3dv: lower frexp

  • v3dv: implement support for shader spilling

  • v3dv: fix GFXH-930 workaround

  • v3dv: add workaround for GFXH-1602

  • v3dv: improve handling of too large image sizes

  • v3dv: handle draw after barrier

  • v3dv: fix vkCmdCopyBuffer unaligned TLB access

  • v3dv: fix textureSize() for cube arrays

  • v3dv: fix srcSubresource description for image to buffer blits

  • v3dv: fix blit_shader() to honor the region’s aspect mask

  • v3dv: handle unnormalized coordinates in samplers

  • v3dv: use swizzle X001 with D/S formats

  • v3dv: fix regressions for cubemap array load/store

  • v3dv: fix color border clamping with specific formats

  • v3dv: make sure we emit vertex attributes in location order

  • v3d/compiler: support swapping R/B channels in vertex attributes.

  • v3dv: handle VK_FORMAT_B8G8R8A8_UNORM vertex attributes

  • v3dv: don’t support sRGB buffer formats

  • v3dv: improve pipeline barrier handling

  • v3dv: use a binning sync for CL jobs waiting on a semaphore

  • v3dv: ignore stencil load operation if attachment format doesn’t have stencil

  • v3dv: only use per-buffer clear bit for cases where we are already storing

  • v3dv: avoid prime blit path when presenting WSI images

  • v3dv: only care about barriers between GPU jobs

  • v3dv: emit new shader state if viewport is dirty

  • v3dv: only clear depth/stencil attachments if any aspect needs clearing

  • v3dv: add a fast path for vkCmdClearAttachments

  • v3dv: enable shaderClipDistance

  • v3dv: enable fillModeNonSolid

  • v3dv: fix dynamic state after meta operation

  • v3dv: consider MSAA when computing frame tiling

  • v3dv: process VkPipelineMultisampleStateCreateInfo properly

  • v3dv: implement subpass multisample rendering and resolve

  • v3dv: implement vkCmdResolveImage for whole images

  • v3dv: handle multisampled image copies in the TLB path

  • v3dv: setup texture shader state correctly for multisampled images

  • v3dv: add a blit fallback path for vkCmdResolveImage

  • v3dv: handle multisampled image copies with the blit path

  • broadcom/compiler: handle gl_SampleMask writes in fragment shaders

  • v3dv: amend tile size tables with smallest tile sizes available

  • nir/glsl: add a glsl_ivec4_type() helper

  • v3dv: fix blitting of signed integer formats

  • v3dv: handle multisample resolve of integer formats

  • v3dv: handle multisample resolves for formats that don’t support TLB resolves

  • v3dv: handle multisample image clears

  • broadcom/compiler: implement nir_intrinsic_load_sample_pos

  • broadcom/compiler: track if the fragment shader forces per-sample MSAA

  • v3dv: enable sample rate shading if fragment shader reads gl_SampleID

  • v3dv: implement nir_texop_texture_samples

  • v3dv: handle multisample rasterization with empty framebuffers

  • nir/lower_io: add an option to lower interpolateAt functions

  • v3dv: lower interpolateAt functions in NIR and enable sample rate shading

  • v3dv: only require texel-size alignment for linear images

  • v3dv: fix 3D image blits

  • v3dv: don’t cache subpass color clear pipelines

  • v3dv: move meta init/finish to meta implementation files

  • nir: add a nir_get_ubo_size intrinsic

  • v3d/compiler: implement nir_intrinsic_get_ubo_size

  • v3dv: handle QUNIFORM_GET_UBO_SIZE

  • broadcom/compiler: rename QUNIFORM_GET_BUFFER_SIZE to QUNIFORM_GET_SSBO_SIZE

  • v3d/compiler: add a lowering pass for robust buffer access

  • v3dv: hook up robust buffer access

  • v3dv: fix color clear pipeline destruction for 32-bit architectures

  • v3dv: handle VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_MODIFIER_INFO

  • v3dv: expose DRM modifiers based on supported features

  • v3dv: fix offset computed by vkGetImageSubresourceLayout for array images

  • v3dv: fix size computed by vkGetImageSubresourceLayout for 3D images

  • v3dv: do not expose VK_IMAGE_USAGE_SAMPLED_BIT for swapchains

  • v3dv: signal semaphore/fence if needed after acquiring a swapchain image

  • v3dv: fix sampling from stencil aspect of a combined depth/stencil image

  • v3dv: honor VkPipelineDepthStencilStateCreateInfo::depthWriteEnable

  • v3dv: don’t leak dumb BO handles allocated for swapchain images

  • v3dv: clean-up after obtaining an XCB connection

  • v3dv: free noop job if needed when finishing the queue

  • v3d/compiler: allow to batch spills

  • v3dv: always program a reasonable internal depth type for copies/clears

  • v3dv: only advertise one memory type

  • v3dv: flag tmu_dirty_rcl in primaries when linking secondaries that have it set

  • v3dv: implement workaround for GFXH-1461

  • v3dv: implement workaround for GFXH-1918

  • v3dv: fixes for barriers in secondary command buffers

  • v3dv: fix blit path for copies from 3D compressed images

  • v3dv: generate proper UUIDs for device and driver

  • v3dv: limit blit framebuffer dimensions to max coordinates

  • v3dv: drop a couple of obsolete comments

  • v3dv: fix buffer copies to compressed images on the blit path

  • broadcom/compiler: track partially interpolated fragment inputs

  • v3d/compiler: implement load interpolated input intrinsics

  • v3dv: skip unnecessary tile loads when blitting

  • v3dv: fix multi-layered buffer to image copies on the blit path

  • v3dv: do not attempt to blit from a linear image source

  • v3dv: fix Z coordinate for 3D blits

  • v3dv: handle compressed image to buffer copies on the blit path

  • v3dv: handle buffer to linear depth/stencil image copies in blit path

  • broadcom/cle: fix vec size dump when set to 0

  • v3d/compiler: fix BGRA vertex attributes for vec2/float size.

  • v3dv: compute swap_rb flag after applying all swizzles

  • v3dv: properly describe swap_color_rb

  • v3dv: enable the logicOp feature

  • v3dv: grow meta descriptor pool dynamically

  • v3dv: enable alphaToOne feature

  • v3dv: add image view debug checks for VK_KHR_maintenance1

  • v3dv: fix base slice selection for copies involving 3D images

  • v3dv: update assertion to match VK_KHR_maintenance1 semantics

  • v3dv: implement vkTrimCommandPool

  • v3dv: expose VK_KHR_maintenance1

  • v3dv: add support for timestamp queries

  • v3dv: fix occlusion query inheritance in secondary command buffers

  • zink: require Vulkan timestamp queries for time query caps

  • zink: add VK_STRUCTURE_TYPE_WSI_MEMORY_ALLOCATE_INFO_MESA for WSI allocations

  • v3dv: add a v3dv_bo_init helper

  • v3dv: expose more features

  • zink: fix pNext chain for resource memory allocation

Ian Romanick (34):

  • intel/vec4: Silence unused parameter warnings in brw_vec4_generator.cpp

  • intel/compiler: Silence unused parameter warning in brw_surface_payload_size

  • intel/compiler: Don’t fallback to vec4 when scalar GS compile fails [v2]

  • intel/vec4: Remove inline lowering of LRP

  • intel/compiler: Remove INTEL_SCALAR_… env variables

  • intel/vec4: Remove all support for Gen8+ [v2]

  • intel/vec4: Remove everything related to VS_OPCODE_SET_SIMD4X2_HEADER_GEN9

  • i965: Allow viewport array extensions with allow_higher_compat_version

  • intel: Silence many unused parameter warnings in blorp_genX_exec.h

  • i965: Silence many unused parameter warnings in genX_blorp_exec.c

  • i965: Silence many unused parameter warnings in genX_state_upload.c

  • i965: Make MOCS index tables static const

  • i965: Rename gen10_emit_isp_disable to gen7_emit_isp_disable

  • intel: Disable all support for Gen10

  • intel/compiler: Remove Gen10-specific code

  • i965: Remove Gen10-specific state setup and workarounds

  • i965: Don’t build Gen10-specific files and libraries

  • intel: Remove Gen10-specific cache config code

  • intel/isl: Don’t generate Gen10-specific functions

  • iris: Don’t generate Gen10-specific functions

  • anv: Don’t generate Gen10-specific functions

  • intel: Remove Gen10-specific perf support

  • intel: Remove Gen10-specific device entries

  • i965: Silence unused parameter warnings

  • mesa/st: Silence unused parameter warnings in st_context.c

  • mesa: Pass the correct caller string to _mesa_lookup_or_create_texture

  • glx: rework __glXCalculateUsableExtensions to be more readable

  • nir: Rename replicated-result dot-product instructions

  • mesa: Open-code hash walk in _mesa_HashPrint

  • mesa: Store the atlas Id in the gl_bitmap_atlas structure

  • i965: Get the gl_perf_query_object Id from the object

  • mesa: Remove the key parameter from the _mesa_HashWalk callback

  • mesa: Remove the key parameter from the _mesa_HashDeleteAll callback

  • intel/compiler: Rotate instructions ROR and ROL cannot have source modifiers

Icecream95 (27):

  • panfrost: Fix border colour

  • docs/features: Add missing Panfrost extensions

  • panfrost: Cleanup panfrost_get_param

  • panfrost: Remove old comment on broken depth reload

  • panfrost: Correctly set modifier_constant

  • panfrost: Separate resource setup and bo creation

  • panfrost: Move tiled-linear conversion checking to a new function

  • panfrost: AFBC to linear layout conversion

  • pan/mdg: Fix spilling of non-32-bit types

  • panfrost: Set modifier_constant to true for exported resources

  • pan/mdg: Return a bool from midgard_nir_lod_errata

  • pan/mdg: Use nir_shader_instructions_pass for nir_lod_errata

  • pan/mdg: Use nir_shader_instructions_pass for fdot2 lowering

  • Revert “panfrost: Drop implicit blend pooling”

  • panfrost: Clamp uniform buffer size

  • panfrost: Handle non-positive viewport positions

  • panfrost: Remove redundant casts of viewport position

  • panfrost: Mark blit shaders as internal

  • pan/mdg: Infer whether to disassemble shaders from info.internal

  • panfrost: Add a debug flag to disable AFBC

  • panfrost: Precise occlusion query support

  • panfrost: Only enable occlusion queries when active

  • panfrost: Move zs format handling code out of the !afbc case

  • panfrost: Z16 depth buffer support

  • panfrost: AFBC compress Z16 depth buffers

  • panfrost: Fix AFBC blits of resources with faked RGTC

  • panfrost: Fix stack shift calculation

Igor V. Kovalenko (1):

  • r600: amend space check for chips older than EVERGREEN

Ilia Mirkin (1):

  • panfrost: enable DrawTransformFeedback*

Indrajit Kumar Das (5):

  • mesa: add NV_copy_depth_to_color support for nir

  • gallium: prepare framework for supporting GL_NV_shader_atomic_int64

  • mesa,glsl: add support for GL_NV_shader_atomic_int64

  • radeonsi: enable support for GL_NV_shader_atomic_int64

  • radeonsi/gfx10: fix stream index for multi-stream overflow query

Italo Nicola (12):

  • nir: add shared/global atomics to nir_get_io_offset_src()

  • panfrost: fix undefined value access on mir_set_intr_mask()

  • panfrost: add atomic_cmpxchg opcode

  • panfrost: add LDST_ADDRESS property to atomic ops

  • panfrost: introduce LDST_ATOMIC property

  • panfrost: add support for src[3] in LOAD_STORE ops

  • panfrost: add atomic ops infrastructure

  • panfrost: add support for atomics

  • nir/algebraic: fold some nested comparisons with ball and bany

  • pan/mdg: remove unused arg from ALU_CHECK_CMP and ALU_CASE_CMP

  • pan/mdg: map uabs_i/usub to i/uabsdiff

  • pan/mdg: fix LOCAL_STORAGE wls_instances packing

Iván Briano (1):

  • anv: restrict number of subgroups per group

James Park (16):

  • amd/addrlib: Fix warning list for msvc

  • radv: Increased const usage

  • util: Hide timespec_passed on Windows

  • radv: Only close local_fd when valid

  • ac,amd/llvm,radv: Initialize structs with {0}

  • util,radv,radv/winsys: Cross-platform rwlock API

  • util,ac,aco,radv: Cross-platform memstream API

  • util: Fix rwlock Windows include for MinGW

  • util/xmlconfig: Disable for Windows like Android

  • aco: Clean up some C++ usages

  • vulkan/util,vulkan/wsi,radv: Add typed outarray API

  • aco: Fix accidental copies, attempt two

  • nir: Stabilize compact_components sort

  • amd/llvm,aco: Replace VLA with alloca

  • radv,radv/winsys: Move RADV_MAX_IBS_PER_SUBMIT

  • radv: Fix leak in radv_amdgpu_winsys_destroy()

Jan Beich (1):

  • spirv: switch to util_bswap32 to improve portability

Jan Ziak (1):

  • Add driver override to envvars.rst

Faith Ekstrand (296):

  • iris: no-op implement set_compute_resources

  • iris: Implement set_global_binding

  • iris: Add support for serialized NIR

  • intel/cs_intrinsics: Handle 64-bit intrinsics

  • intel/compiler: Allow MESA_SHADER_KERNEL

  • iris: Use blob_write_uint32 for num_system_values

  • iris: Add a kernel_input_size field for compiled shaders

  • iris/disk_cache: Stop assuming stage == cache_id

  • iris: Copy dest size from the original intrinsic in setup_uniforms

  • iris: Upload kernel inputs with system values

  • iris: Add support for MESA_SHADER_KERNEL in the disk cache

  • nir: Add and use nir_foreach_block_unstructured helpers

  • nir/lower_goto_if: Document some data structures

  • nir/lower_goto_if: Clean up ralloc usage

  • nir/lower_goto_if: Use util/list instead of exec_list

  • nir/lower_goto_if: Rework handling of skip targets

  • nir/lower_goto_if: Rework some set union logic

  • nir/lower_goto_if: Sort blocks in select_fork

  • nir/lower_goto_if: Add a block_for_singular_set helper

  • nir/lower_goto_if: Replace a triple loop with a double loop

  • nir/lower_goto_if: Add a route::outside set

  • nir/lower_goto_if: Add some debug prints

  • spirv: Add a MESA_SPIRV_FORCE_UNSTRUCTURED environment variable

  • nir/builder: Make nir_get_ptr_bitsize take a nir_shader

  • spirv: Don’t emit RMW for vector indexing in shared or global

  • clover/nir: Stop setting ubo_addr_format

  • clover/nir: Stop computing the global address format twice

  • clover/nir: Use the correct address mode for shared

  • nir: Initialize nir_ssa_def::live_index

  • nir/builder: Add a nir_iand_imm helper

  • nir/find_array_copies: Handle cast derefs

  • nir/large_constants: Handle incomplete derefs

  • compiler/types: Allow interfaces in get_explicit_type_for_size_align

  • nir/opt_large_constants: Fix a type/deref_type typo

  • nir: Add an LOD parameter to image_*_size

  • iris: Stop advertising PIPE_SHADER_IR_NIR_SERIALIZED

  • iris: Stop advertising clover-only caps

  • iris: ref/unref the GLSL type singleton in screen_create/destroy

  • iris: Normalize all compute shaders to MESA_SHADER_COMPUTE

  • iris: Always re-upload sysvals when we have kernel inputs

  • intel/fs: Fix an assert in load_scratch

  • intel/nir: Allow splitting a single load into up to 32 loads

  • clover/spirv: Don’t call llvm::regularizeLlvmForSpirv

  • clover: Call clang with -O0 for the SPIR-V path

  • nir: Report progress properly in nir_lower_bool_to_*

  • intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage

  • intel/nir: Rewrite the guts of lower_alpha_to_coverage

  • intel/nir: Clean up lower_alpha_to_coverage a bit

  • nir: Use a switch in nir_inline_function_impl

  • nir: Take a variable remap parameter in nir_inline_function_impl

  • intel/fs: Add support for vec8 and vec16 ops

  • intel/nir: Lower things with > 4 components in lower_mem_access_bit_sizes

  • spirv: Support big-endian strings

  • spirv: Delete some dead workgroup variable handling code

  • nir: Rename num_shared to shared_size

  • nir: Improve the comment on num_inputs and friends

  • intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+

  • nir: Add a new nir_var_mem_constant variable mode

  • nir: Add a load_global_constant intrinsic

  • nir/lower_io: Use the variable mode for load_scratch_base_ptr checks

  • nir/lower_io: Add a build_addr_for_var helper

  • nir/lower_io: Add support for nir_var_mem_constant

  • nir: Allow opt_large_constants to be run with constant_data_size > 0

  • spirv: Use nir_var_mem_constant for UniformConstant data in CL

  • intel/fs: Implement nir_intrinsic_load_global_constant

  • nouveau/nir: Implement load_global_constant

  • llvmpipe: Add support for load_global_constant

  • clover/nir: Use nir_var_mem_constant for __constant memory

  • spirv: Drop the constant_as_global option

  • nir/lower_explicit_io: Assert that compute address sizes match derefs

  • clover: Use 64-bit offsets for shader_in on 64-bit GPUs

  • nir/clone: Add a helper for cloning most instruction types

  • intel/compiler: Get rid of the global compaction table pointers

  • intel/compiler: Get rid of struct gen_disasm

  • iris: Use gen_disassemble

  • intel/eu: Add some new helpers

  • intel/fs,vec4: Stuff the constant data from NIR in the end of the program

  • anv: Stop storing the shader constant data side-band

  • intel/eu: Include brw_compiler.h in brw_eu.h

  • intel/eu: Add a mechanism for emitting relocatable constant MOVs

  • intel/fs: Add support for a new load_reloc_const intrinsic

  • anv: Properly cache brw_stage_prog_data::relocs

  • nir/builder: Add load/store_global helpers

  • anv: Patch constant data pointers into shaders when using softpin

  • iris: Patch constant data pointers into shaders

  • intel/fs: Don’t copy-propagate stride=0 sources into ddx/ddy

  • intel/fs: Use a single untyped surface read for load_num_work_groups

  • intel/nir: Lower load_num_work_groups to 32-bit if needed

  • iris: Re-emit push constants if we have a varying workgroup size

  • intel/compiler: Handle all indirect lowering choices in brw_nir.c

  • nir/lower_indirect_derefs: Add a threshold

  • intel/nir: Stop using nir_lower_vars_to_scratch

  • nir: Don’t bail too early in lower_mem_constant_vars

  • clover: Call nir_lower_mem_constant_vars

  • compiler/types: Make booleans 32-bit for cl_size/align

  • nir/glsl: Add an explicit_alignment field to glsl_type

  • nir: Add alignment information to cast derefs

  • nir: Handle all array stride cases in nir_deref_instr_array_stride

  • nir: Add a helper for getting the alignment of a deref

  • nir/lower_io: Apply alignments from derefs when available

  • nir/opt_deref: Don’t remove casts with alignment information

  • nir/opt_deref: Remove restrictive alignment information from casts

  • spirv: Add pointer helper vars to OpCopyMemory

  • spirv: Propagate alignments to deref chains via casts

  • nir: Allow var_mem_global in nir_lower_vars_to_explicit_types

  • nir: Allow uniform in nir_lower_vars_to_explicit_types

  • clover: Use args.size() to compute new var locations

  • spirv: Stop counting inputs in entry_point_wrapper

  • clover/nir: Use lower_vars_to_explicit for uniform and global

  • spirv: Drop the OpenCL type layout code

  • anv: Set alignments on UBO/SSBO root derefs

  • compiler/types: Fix deserializing structs with >= 15 members

  • spirv: Improve the “Entry point not found” error message

  • spirv2nir: Rework argument handling

  • nir/lower_io: Fix the unknown-array-index case in get_deref_align

  • nir: Add a dominance validation pass

  • spirv: Run repair_ssa if there are discard instructions

  • intel/nir: Call validate_ssa_dominance at both ends of the NIR compile

  • nir: More NIR_MAX_VEC_COMPONENTS fixes

  • nir/idiv_const: Use the modern nir_src_as_* constant helpers

  • anv: Fix the target_bo assertion in anv_reloc_list_add

  • clover: Pull the stride from pipe_transfer for image maps

  • spirv: Access qualifiers are not a bitfield

  • spirv: Plumb access qualifiers through from image types

  • nir: Add a pass for lowering CL-style image ops to texture ops

  • intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP

  • nir: Rename get_buffer_size to get_ssbo_size

  • radeonsi: Only call nir_lower_var_copies at the end of the opt loop

  • spirv: vtn_fail with a nice message on unsupported rounding modes

  • nir/liveness: Consider if uses in nir_ssa_defs_interfere

  • compiler/types: Add glsl_baseN_t_type(bit_size) helpers

  • spirv: Use the new types helpers

  • nir: Add a new memcpy intrinsic

  • nir: Add a lowering pass to lower memcpy

  • spirv: Add support for OpCopyMemorySized

  • clover/nir: Call the memcpy lowering pass

  • nir: Allow creating variables with nir_var_mem_push_const.

  • nir/lower_io: Add support for push constants

  • anv,radv,tu,val: Call nir_lower_io for push constants

  • spirv: Use derefs for push constants

  • vallium: Stop using lower_ubo_ssbo_access_to_offsets

  • spirv: Delete the legacy offset/index UBO/SSBO lowering

  • nir/copy_propagate: Copy-prop into jump conditions

  • nir: Disallow goto and goto_if in clone and [de]serialize

  • nir/cf: Better handle intra-block splits

  • nir/validate: Improve the validation of blocks

  • nir/lower_goto_ifs: Don’t destroy SSA form in the process

  • nir/dominance: Use _mesa_set_clear instead of hand-rolling it

  • spirv: Only run repair_ssa if structured

  • nir/lower_goto_ifs: Use rzalloc

  • nir/lower_goto_ifs: Add asserts for SSA forks

  • nir/lower_goto_ifs: Always include level dom_frontiers in prev_frontier

  • Revert “nir/lower_goto_if: Add a route::outside set”

  • anv: Allow HiZ clears for multi-view

  • anv: Use more temp vars in cmd_buffer_begin_subpass

  • anv: Skip HiZ and CCS ambiguates which precede fast-clears

  • nir: Split NIR_INTRINSIC_TYPE into separate src/dest indices

  • nir: Add a conversion and rounding intrinsic

  • nir: Add builder helpers for OpenCL type conversions

  • nir: Add passes for nir_intrinsic_convert_alu_types

  • spirv: Add some conversion handling helpers

  • spirv: Handle all OpenCL conversion ops with full rounding

  • spirv/opencl: Drop dest_type from handle_v_load_store

  • clover/nir: Call nir_lower_convert_alu_types

  • nir: Add lowering from regular ALU conversions to the intrinsic

  • intel/fs: NoMask initialize the address register for shuffles

  • nir: Fix a misspelling

  • nir/find_array_copies: Properly discard copies for casts

  • nir: Handle memcpy in copy_prop_vars and combine_stores

  • nir: Add a memcpy optimization pass

  • nir/opt_load_store_vectorize: Use bit sizes when checking mask compatibility

  • nir: Add component mask re-interpret helpers

  • nir/opt_deref: Add an instruction type switch

  • nir/opt_deref: Add an optimization for bitcasts

  • nir: Add a pass to lower vec3s to vec4s

  • intel/fs: Don’t use NoDDClk/NoDDClr for split SHUFFLEs

  • iris: Fix the constant data address calculation

  • anv: Implement VK_EXT_transform_feedback on Gen7

  • spirv: Make the clc_shader const

  • nir/constant_folding: Use the builder

  • nir/constant_folding: Use nir_shader_instruction_pass

  • nir: Validate constant initializers

  • nir/constant_folding: Fold load_deref of nir_var_mem_constant

  • iris: Add pipe-loader support

  • iris: Handle runtime-specified local memory size

  • iris: Add support for load_work_dim as a system value

  • iris: Fill out compute caps and enable clover support

  • gallium/pipe: Add a GALLIUM_PIPE_SEARCH_DIR override env var

  • util/xxd.py: Add an option for binary files

  • spirv: Add a shared libclc loader

  • spirv: Move nir_lower_libclc to src/compiler/spirv

  • intel/nir: Don’t try to emit vector load_scratch instructions

  • intel/nir: Lower load_global_constant in lower_mem_access_bit_sizes

  • i965: Take an isl_format in emit_buffer_surface_state

  • intel/fs: Add an alignment to VARYING_PULL_CONSTANT_LOAD_LOGICAL

  • intel/fs: Add an option to use dataport messages for UBOs

  • anv: Add a device parameter to format_for_descriptor_type

  • anv: Use format_for_descriptor_type for descriptor buffers

  • anv: Plumb the device into *bits_for_access_flags

  • anv: Use the data cache for indirect UBO pulls on Gen8+

  • iris: Use the data cache for indirect UBO pulls

  • clover: Stop leaking NIR shaders

  • nir/opt_deref: Fix the vector bitcast optimization

  • nir: Allow more deref modes in phis

  • intel/batch_decoder: Don’t claim vec4 vs/gs/tcs shaders on Gen11+

  • intel/fs: Copy the PTSS from g0 for scratch reads/writes

  • intel/fs: Add a SCRATCH_HEADER opcode

  • intel/fs/ra: Increment spill_offset as part of the emit_spill loop

  • intel/fs/ra: Refactor handling of Gen7 scratch reads

  • intel/fs/ra: Store the last non-spill VGRF node

  • intel/fs/ra: Sanity-check our IP counts

  • intel/fs/ra: Use a set to track added spill/fill instructions

  • intel/fs: Rework scratch handling on Gen9+

  • intel/fs: Allow constant-propagation into SAMPLEINFO and IMAGE_SIZE

  • anv: Go back to using the sampler for UBO pulls

  • Revert “iris: Use the data cache for indirect UBO pulls”

  • anv: Bump the number of update-after-bind descriptors to 1M

  • anv: Add a descriptor_count to descriptor sets

  • anv: Implement VariableDescriptorCount

  • iris: Flush caches based on brw_compiler::indirect_ubos_use_sampler

  • anv,iris: Use the data cache for UBO pulls on Gen12+

  • spirv: Add 0.5 to integer coordinates for OpImageSampleExplicitLod

  • nir/lower_io: Assert non-zero power-of-two alignments

  • compiler/types: Assert non-zero alignments in get_explicit_type_for_size_align

  • compiler/types: Allow images and samplers in get_explicit_type_for_size_align

  • clover/nir: Calculate sizes of images and samplers properly

  • clover/nir: Add an image lowering pass

  • spirv: Fix OpCopyMemorySized

  • nir/lower_memcpy: Don’t mask the store

  • docs: Specify when branch points happen

  • nir/validate: Explain why we don’t use nir_foreach_block

  • mesa/spirv: Lower variable initializers for global variables

  • nir/builder: Add a nir_ieq_imm helper

  • nir/phis_to_scalar: Use a deny-list for load_deref modes

  • nir: Handle incomplete derefs in split_struct_vars

  • nir: Use var->data.mode instead of deref->mode in a few cases

  • nir: Disallow writes to system values and mem_constant

  • nir/opt_find_array_copies: Allow copies from mem_constant

  • nir: Add and use some deref mode helpers

  • nir/lower_array_deref_of_vec: Use nir_deref_mode_must_be

  • nir/lower_io: Use nir_deref_mode_* helpers

  • nir/phis_to_scalar,gcm: Use nir_deref_mode_may_be

  • nir: Only force loop unrolling if we know it’s a in/out/temp

  • nir/vars_to_ssa: Use nir_deref_must_be

  • nir/vec3_to_vec4: Use nir_deref_must_be

  • nir: Use nir_deref_mode_may_be in deref optimizations

  • nir/find_array_copies: Prepare for generic pointers

  • nir/split_*_vars: Prepare for generic pointers

  • nir: Make nir_deref_instr::mode a bitfield

  • nir: Add support for generic pointers

  • spirv: Add generic pointer support

  • nir/opt_deref: Add a deref mode specialization optimization

  • nir/opt_deref: Add an optimization for deref_mode_is

  • nir/lower_io: Add a mode parameter to build_addr_iadd

  • nir/lower_io: Add a mode parameter to addr_format_is_*

  • nir/lower_io: Add support for 32/64bit_global for shared

  • nir/lower_io: Add support for lowering deref_mode_is

  • nir/lower_io: Support generic pointer access

  • nir/lower_io: Add a new 62bit_generic address format

  • nir/opt_intrinsics: Report progress for the gl_SampleMask optimization

  • nir/constant_folding: Use a switch in try_fold_intrinsic

  • nir/constant_folding: Use the standard variable naming convention

  • nir: Move constant folding of vote to opt_constant_folding

  • nir/constant_folding: Fold subgroup shuffle intrinsics

  • nir/opt_intrinsics: Refactor a bit

  • nir/opt_intrinsic: Optimize bcsel(b, shuffle(x, i), shuffle(x, j))

  • nir/find_array_copies: Don’t assume all children exist

  • nir/deref: Fix a typo

  • spirv: Add basic plumbing for ray-tracing capabilities

  • spirv: Remove a redundant vtn_fail_if

  • spirv: Add a guard for OpTypeForwardPointer storage classes

  • spirv: Pass the deref type to storage_class_to_mode for non-forward pointers

  • spirv: Add support for OpTypeAccelerationStructureKHR

  • spirv,nir: Add support for ray-tracing built-ins

  • nir/builder: Add a select_from_ssa_def_array helper

  • nir: Add intrinsics for object to/from world RT sysvals

  • nir: Add new variable modes for ray-tracing

  • spirv: Implement the new ray-tracing storage classes

  • nir,spirv: Add support for the ShaderCallKHR scope

  • spirv,nir: Add ray-tracing intrinsics

  • nir: Handle ray-tracing intrinsics and storage classes in copy-prop etc.

  • spirv: Update headers and metadata from latest Khronos commit

  • nir: Print formats on image intrinsics as text

  • nir: Validate image atomic formats

  • util,gallium: Add new 64-bit integer formats

  • compiler/types: Add 64-bit image types

  • nir: Allow 64-bit image atomics

  • spirv: Add support for SPV_EXT_shader_image_atomic_int64

  • nir/lower_bit_size: Don’t cast comparison results

  • nir/lower_bit_size: Pass a nir_instr to the callback

  • nir/lower_bit_size: Add support for lowering subgroup ops

  • intel/nir: Refactor lower_bit_size_callback

  • intel/nir: Lower 8-bit scan/reduce ops to 16-bit

  • intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+

  • intel/fs: Fix use of undefined value in fixup_nomask_control_flow

  • spirv: Call repair SSA for OpTerminateInvocation

Jesse Natalie (61):

  • nir: nir_range_analysis needs to be updated for vec16

  • u_debug_stack_test: Fix MSVC compiling by using ATTRIBUTE_NOINLINE

  • util/macros: Add ATTRIBUTE_NOINLINE definition for MSVC

  • glsl: Add ‘bare’ shadow sampler type

  • nir: Fix serialize/deserialize of void samplers/images

  • nir: Optimize mask+downcast to just downcast

  • nir: Add nir_address_format_32bit_offset_as_64bit

  • nir: Add nir_address_format_32bit_index_offset_pack64

  • nir/vtn: CL SPIR-V callers should specify address modes

  • mesa: Move ATTRIBUTE_NOINLINE for glsl_to_tgsi_visitor::visit_expression for MSVC

  • nir: Add fisnormal op

  • nir/vtn: Support SpvOpIsNormal via fisnormal

  • nir: Add fisfinite op

  • nir/vtn: Support SpvOpIsFinite via fisfinite

  • nir/vtn: Handle LessOrGreater deprecated opcode

  • nir/vtn: Support OpOrdered and OpUnordered opcodes

  • nir/glsl: Add glsl_get_cl_type_size_align helper

  • nir: Use ‘unsigned’ instead of enum types in nir_variable::data

  • wgl: Switch to Win10 version defines to enable usage of Win10 WGL callbacks

  • nir: Populate some places where existing system values were missing

  • nir: Add new system values and intrinsics for dealing with CL work offsets

  • nir: Move compute system value lowering to a separate pass

  • nir: Add options to nir_lower_compute_system_values to control compute ID base lowering

  • spirv: Use new global invocation offset system value

  • nir: Add a lowering pass to split 64bit phis

  • nir: Relax opt_if logic to prevent re-merging 64bit phis for loop headers

  • nir_lower_bit_size: Support lowering ops with differing source/dest sizes

  • nir: Implement mul_high lowering for bit sizes other than 32

  • nir: Remove 32bit restriction for uadd_carry optimization

  • nir: Add bit_count to lower_int64 pass

  • nir/vtn: SPIR-V bit count opcodes (core and extension) dest size mismatches nir

  • clover/nir/spirv: Use uniform rather than shader_in for kernel inputs

  • nir/vtn: Add type constant to image intrinsics

  • nir/vtn: Add support for kernel images to SPIRV-to-NIR.

  • nir/vtn: Use return type rather than image type for tex ops

  • nir/vtn: Handle integer sampling coordinates

  • nir/vtn: ImageSizeLod op can be applied to images

  • nir/vtn: Add intrinsics for CL image format/order queries

  • nir/vtn: Convert constant samplers to variables with data

  • nir_dominance: Use uint32_t instead of int16_t for dominance counters

  • nir: More NIR_MAX_VEC_COMPONENTS fixes

  • spirv: Handle OpTypeOpaque

  • glsl_type: Add packed to structure type comparison for hash map

  • nir_lower_system_values: Fix load_global_invocation_id to use base_work_group_id even with no base_global id

  • nir: Add an internal flag to shader_info

  • nir: Add glsl_base_type unsigned -> signed version helper

  • nir/vtn: Add handling for SPIR-V event variables

  • vtn/opencl: Rework type handling for CL extension opcodes

  • vtn/opencl: Add infrastructure for calling out to libclc

  • vtn/opencl: Implement a lot of opcodes via libclc

  • vtn/opencl: Rework handle_instr to be able to handle core SPIR-V opcodes via libclc

  • vtn/opencl: Hook up OpenCL async copy and group wait opcodes via libclc

  • vtn/opencl: Switch non-native trig to use libclc

  • vtn/opencl: Switch exp/pow/log to use libclc

  • vtn/opencl: Switch division-related ops to use libclc

  • vtn/opencl: Switch some nir-sequence ops to use libclc

  • vtn/opencl: Only use libclc ldexp when lower_ldexp is set

  • vtn/opencl: Switch fma to conditionally use libclc for 32bit floats

  • spirv: Implement vload[a]_half[n] and vstore[a]_half[n][_r]

  • util: Move xxd.py to util

  • util: Make xxd.py output char array instead of string

John Bates (1):

  • disk_cache: build option for disabled-by-default

Jonathan Gray (13):

  • util: unbreak endian detection on OpenBSD

  • util/anon_file: add OpenBSD shm_mkstemp() path

  • meson: build with _ISOC11_SOURCE on OpenBSD

  • meson: don’t build with USE_ELF_TLS on OpenBSD

  • meson: conditionally include -ldl in gbm pkg-config file

  • util: futex fixes for OpenBSD

  • util/u_thread: include pthread_np.h if found

  • anv: use os_get_total_physical_memory()

  • util/os_misc: add os_get_available_system_memory()

  • anv: use os_get_available_system_memory()

  • util/os_misc: os_get_available_system_memory() for OpenBSD

  • radv: remove seccomp includes

  • vulkan: make VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT conditional

Jonathan Marek (57):

  • panfrost: add missing dependency on midgard_pack.h

  • util/format: expose generated format packing functions through a header

  • turnip: implement VK_EXT_custom_border_color

  • turnip: remove dead tu_minify/typed_memcpy functions

  • turnip: delete a blit_image TODO that has already been resolved

  • turnip: fix CmdBlitImage with D32_SFLOAT_S8_UINT

  • turnip: rework format_to_ifmt

  • turnip: call packing functions directly for pack_gmem_clear_value

  • turnip: add missing tu_bo_list_add in CmdWriteTimestamp

  • freedreno/ir3: remove indirect input load

  • freedreno/ir3: improve handling of aliased inputs

  • freedreno/ir3: rework setup_{input,output} to make struct varyings work

  • freedreno/regs: add 7nm DSI PHY/PLL regs

  • turnip: delete tu_physical_device path field

  • turnip: delete unused tu_image fields

  • turnip: fix the type of tu_shader_module code field, delete unused sha1

  • turnip: delete unused “tu_cmd_buffer_upload”

  • turnip: remove some unnecessary regs init

  • turnip: rework vertex buffers draw state handling

  • turnip: device global bo list

  • turnip: avoid heap allocations in QueueSubmit when semaphores are used

  • freedreno/ir3: allow layer/viewport output for VS/GS/DS

  • freedreno/ir3: add view_zero to shader key

  • turnip: multiViewport and VK_EXT_shader_viewport_index_layer

  • vulkan/wsi/display: add option for display fence to signal syncobj

  • turnip: delete unused tu_fence_signal function

  • turnip: add a fd field to tu_device

  • turnip: require syncobj support

  • turnip: rework fences to use syncobjs

  • radv: fix incorrect ResetFences path for WSI fence

  • radv: use syncobj for wsi fence

  • turnip: fix wrong indentation in tu6_draw_common

  • turnip: move A6XX_RB_ALPHA_CONTROL write to init_hw

  • turnip: implement VK_EXT_extended_dynamic_state

  • turnip: remove unused cmd_buffer/device arguments in descriptor sets

  • turnip: delete unused/broken pipeline layout hashing code

  • turnip: initial implementation of VK_KHR_push_descriptor

  • turnip: clean up tu_device_memory

  • turnip: always create permanent syncobj for semaphore

  • turnip: set MSM_SUBMIT_SYNCOBJ_RESET for submit pWaitSemaphores

  • turnip: semaphores simplification (only syncobj semaphores supported)

  • turnip: rework GetSemaphoreFdKHR

  • turnip: rework ImportSemaphoreFdKHR

  • turnip: remove remaining uses of drmSyncobj helpers

  • turnip: share code between semaphores/fences + fence import/export

  • turnip: signal fence and semaphore in AcquireNextImage2KHR

  • turnip: implement legacy API functions separately

  • freedreno/cffdec: fix decoding of bindless descriptors

  • turnip: remove pre-emption marker

  • turnip: implement timestamp fences/semaphores for kgsl backend

  • turnip: rework android gralloc path so it doesn’t call tu_image_create

  • turnip: don’t implement CreateImage as two separate functions

  • turnip: LAYOUT_PREINITIALIZED is not different for optimal tiling

  • turnip: remove useless tu_image asserts

  • turnip: remove unnecessary/redundant tu_image fields

  • turnip: don’t always fallback to linear for mutable formats

  • turnip: enable VK_EXT_image_drm_format_modifier

Jordan Justen (4):

  • anv, iris: Set MediaSamplerDOPClockGateEnable for gen12+

  • anv: Drop warning about gen12 not being supported

  • intel/dev: Add device info for ADL-S

  • intel/mi_builder: Support gen11 command-streamer based register offsets

Jose Maria Casanova Crespo (7):

  • vc4: Avoid negative scissor caused by no intersection

  • nir/algebraic: optimize iand/ior of (n)eq zero when umax/umin not available

  • vc4: Enable lower_umax and lower_umin

  • vc4: enable lower_isign for VC4

  • vc4: Add missing load_ubo set_align in yuv_blit fs.

  • vc4: Add missing range_base/range at nir_load_ubos in yuv_blit fs.

  • vc4: Enable nir_lower_io for uniforms

Joshua Ashton (1):

  • zink: Fix 32-bit compilation

Juan A. Suarez Romero (7):

  • intel: split driver/device UUID generators

  • iris: plumb device/driver UUID generators

  • intel/uuid: use git-sha1/package for the driver UUID

  • st/mesa: initialize lower alpha func to ALWAYS

  • v3d/compiler: extend swapping R/B support to all vertex attributes

  • v3dv: mark the right bit to swap R/B vertex attributes

  • v3d: Add GL_ARB_vertex_array_bgra support

Julian Winkler (1):

  • nir: Add a structurizer

Karol Herbst (48):

  • util/set: add _mesa_set_intersects

  • spirv: rename vtn_emit_cf_list to vtn_emit_cf_list_structured

  • nir: Add a structured flag to nir_shader

  • nir: Add goto_if jump instruction

  • spirv: extract switch parsing into its own function

  • spirv: parse unstructured CFG

  • clover/nir: fix mem_shared by using address_format_32bit_offset

  • nv50/ir/nir: fix smem size

  • nv50/ir/nir: rework indirect function_temp handling

  • clover/nir: Call vars_to_explicit_types for shared memory

  • nve4: fix uploading unaligned sized input buffers

  • nv50/ir/nir: assert on unknown alu ops

  • clover/nir: support int64 atomics if the device supports it

  • nv50/ir/nir: fix global_atomic_comp_swap

  • nvc0: handle nr being 0 in nvc0_set_global_bindings

  • nv50/ir/nir: support load_work_dim

  • clover/spirv: rework handling of spirv extensions

  • clover/spirv: pass list of supported extensions to the translator

  • nir: rename nir_op_fne to nir_op_fneu

  • nir: fix nir_variable_create for kernels

  • clover/nir: add support for global invocation id offsets

  • nv50/ir: remove symbol table support for compute shaders

  • nv50/ir: add nv50_ir_prog_info_out

  • nir: use nir_var_all to get rid of casting

  • util: add helpers to define bitwise operators on enums for C++

  • nir: use enum operator helper for nir_variable_mode and nir_metadata

  • clover/nir: Lower function_temp to scratch.

  • nv50/ir: fix cas lowering for 64 bit

  • clover/nir: use offset for temp memory

  • clover/llvm: undefine __IMAGE_SUPPORT__ for devices without image support

  • nvc0/ir: fix load propagation for sub 4 byte addressing

  • spirv: fix 64 bit atomic inc and dec

  • nvc0/cl: handle 64 bit pointers in nvc0_set_global_handle

  • clover/spirv: fix vec3 alignment

  • nir/serialize: fix serialization of system values

  • clover/util: add id_type_equals to support symbols with multiple sections

  • clover: bind constant buffer if one is provided

  • clover/nir: extract constant buffer into its own section

  • clover/spirv: parse arg_info

  • clover/spirv: support CL_KERNEL_COMPILE_WORK_GROUP_SIZE

  • clover: use pipe_image_view for images instead of set_compute_resources

  • clover: support custom driver strides

  • clover/device: use PIPE_MAX_SHADER_SAMPLER_VIEWS for max_images_read

  • clover/nir: set kernel_image cap

  • nouveau: hide SVM support behind a variable for now as kernel space is broken

  • nvc0/CL: enable images

  • llvmpipe: enable CL images

  • nv50/ir/nir: don’t use designated initializers

Kenneth Graunke (15):

  • iris: Fix headerless sampler messages in compute shaders with preemption

  • nir: Copy semantics to nir_intrinsic_load_fs_input_interp_deltas

  • nir: Move new edgeflag assert into the io_lowered case

  • iris: Reorder the loops in iris_fence_await() for clarity.

  • iris: Drop stale syncobj references in fence_server_sync

  • Revert “nir: replace lower_ffma and fuse_ffma with has_ffma”

  • intel/compiler, anv: Delete cs_prog_data->slm_size

  • iris: Fix doubling of shared local memory (SLM) sizes.

  • anv: Set only one ISL usage bit (RT/texture) for CopyBuffer sources

  • isl, anv, iris: Add a centralized helper to select MOCS based on usage

  • isl: Enable Tigerlake HDC:L1 caches via MOCS in various cases.

  • iris: fix source/destination layers for 3D blits

  • iris: Move blit scissoring earlier.

  • intel/fs: Fix sampler message headers on Gen11+ when using scratch

  • nir/algebraic: Avoid creating new fp64 ops when using softfp64

Khem Raj (1):

  • vc4: use intmax_t for formatted output of timespec members

Kristian Høgsberg (12):

  • egl/android: Call createImageFromDmaBufs directly

  • egl/android: Look up prime fds in droid_create_image_from_prime_fds()

  • egl/android: Drop unused ctx argument

  • egl/android: Simplify droid_create_image_from_name() path

  • egl/android: Move droid_create_image_from_prime_fds() function up

  • egl/android: Use droid_create_image_from_prime_fds() in get_back_bo()

  • egl/android: Add support for CrOS buffer info perform op

  • turnip: Add kgsl backend

  • util/formats: Add PIPE_FORMAT_R8_G8B8_420_UNORM

  • st/mesa: Add NV12 lowering to PIPE_FORMAT_R8_G8B8_420_UNORM

  • freedreno/a6xx: Generalize pointers in struct fd6_pipe_sampler_view

  • freedreno/a6xx: Support PIPE_FORMAT_R8_G8B8_420_UNORM for texturing

Krunal Patel (2):

  • gallium/auxiliary/vl: Odd Dimensions are failing

  • radeon/vcn: Bitrate not updated when changing framerate

Leo Liu (2):

  • frontends/omx/dec: Use the known codec profile when allocating buffers

  • frontends/omx/h265: Check the pps set before the scaling data

Lepton Wu (1):

  • util/ralloc: fix ralloc alignment.

Lionel Landwerlin (36):

  • anv: fix incorrect realloc failure handling

  • intel/dump_gpu: only write BOs mapped by the driver

  • intel/dump_gpu: further track mapping of BOs

  • intel/dump_gpu: set default device_override

  • intel/dump_gpu: add an only-capture option

  • intel/dump_gpu: only map in GTT buffers not previously mapped

  • anv: track the current frame and write it into the driver identifier BO

  • intel/dump_gpu: fix --platform option

  • intel/dump_gpu: add an option to capture a single frame

  • anv: centralize vk to gen arrays

  • anv: fix up dynamic clip emission

  • anv: don’t fail userspace relocation with perf queries

  • intel/perf: store query symbol name

  • intel/perf: fix raw query kernel metric selection

  • anv: fix transform feedback surface size

  • anv: move push constant allocation tracking into gfx pipeline state

  • anv: simplify push constant emissions

  • anv: VK_INTEL_performance_query interaction with VK_EXT_private_data

  • anv: fix robust buffer access

  • include/drm-uapi: bump headers

  • anv: add new gem/drm helpers

  • anv: implement shareable timeline semaphores

  • intel/genxml: make sure test assert are compiled in

  • intel/compiler: fixup Gen12 workaround for array sizes

  • vulkan: bump headers/registry to 1.2.154

  • anv: implement VK_KHR_copy_commands2

  • intel/perf: fix crash when no perf queries are supported

  • intel/dev: add a small non installable tool to print device info

  • intel/dev: fix 32bit build issue

  • genxml: drop gen10

  • blorp: identify copy kernels in NIR

  • blorp: allow blits with floating point source layers

  • anv: fix source/destination layers for 3D blits

  • anv: report latest extension spec versions

  • intel/dev: Bump Max EU per subslice/dualsubslice

  • anv: fix descriptor pool leak in VMA object

Louis Li (1):

  • radeon/radeon_vce: fix out of target bitrate in CBR mode (H.264)

Louis-Francis Ratté-Boulianne (6):

  • st/mesa: factor ucp-lowering logic into helper

  • st/mesa: Enable clip planes lowering for geometry shaders

  • pipebuffer: Remove unused buffer event in slab bufmgr

  • st/mesa: Replace UsesStreams by ActiveStreamMask for GS

  • glsl/linker: Add support for XFB varying lowering in geometry shader

  • gallium: Fix NIR validation when lowering polygon stipple

Lucas Stach (19):

  • etnaviv: stop leaking the dummy texture descriptor BO

  • gallium/dri: allow create image for formats that only support SV or RT binding

  • etnaviv: drm: fix BO refcount race

  • etnaviv: blt: properly program surface TS offset for clears

  • etnaviv: update headers from rnndb

  • etnaviv: tex_desc: fix TS compression enable

  • etnaviv: cosmetic etna_resource_alloc fixes

  • etnaviv: do proper cpu prep/fini when clearing allocated buffer

  • etnaviv: simplify etna_screen_bo_from_handle

  • etnaviv: pass correct layout to etna_resource_alloc for scanout resources

  • etnaviv: don’t import allocated scanout resources via from_handle

  • Revert “gallium/dri: fix dri2_from_planar for multiplanar images”

  • etnaviv: emit RA_EARLY_DEPTH on dirty ZSA

  • etnaviv: flush depth cache when changing depth config

  • etnaviv: update headers from rnndb

  • etnaviv: expose shader discard usage in etna_shader_variant

  • etnaviv: rework ZSA into a derived state

  • gallium: document convention for get_handle calls on multi-planar resources

  • etnaviv: fix disabling of INT filter for real

Lukas F. Hartmann (1):

  • etnaviv: Fix disabling early-z rejection on GC7000L (HALTI5)

Marcin Ślusarz (50):

  • intel/perf: fix calculation of used counter space

  • intel/perf: fix how pipeline stats are stored

  • intel/perf: streamline error handling in read_oa_samples_until

  • intel/perf: fix performance counters availability after glFinish

  • intel/perf: split load_oa_metrics

  • intel/perf: export performance counters sorted by [group|set] and name

  • glsl: fix crashes on out of bound matrix access using constant index

  • gitlab: ask for more detailed info about GPU

  • mesa: fix formatting of messages printed using _mesa_log

  • anv: refresh cached current batch bo after emitting some commands

  • iris: handle os_dupfd_cloexec failure

  • iris: verify color component width in convert_fast_clear_color

  • i965: verify format width in blorp_get_client_bo

  • intel/perf: don’t generate logically dead code

  • intel/compiler/test: use TEST_DEBUG env var consistently

  • intel/compiler: mark debug constant as const

  • intel/fs,vec4: remove unused assignments

  • intel: add INTEL_DEBUG=shaders

  • intel/fs: add hint how to get more info when shader validation fails

  • intel/compiler: match brw_compile_* declarations with their definitions

  • intel/compiler: use the same name for nir shaders in brw_compile_* functions

  • intel/compiler: move extern C functions out of namespace brw

  • intel/compiler: print dispatch width when shader fails to compile

  • intel/compiler: fix typo in a comment

  • anv: fix minor gen_ioctl(I915_PERF_IOCTL_CONFIG) error handling issue

  • intel/compiler: remove unused fs_validator::param_size

  • intel/compiler: initialize remaining fields of various classes

  • intel/tools: fix possible memory leak in the error path

  • intel/tools: handle ftell errors

  • intel/compiler: quiet Coverity warnings

  • intel/tools: fix possible randomly increased verbosity of error2aub

  • intel: add INTEL_DEBUG expected value in declaration

  • iris: drop likely/unlikely around INTEL_DEBUG

  • i965: drop likely/unlikely around INTEL_DEBUG

  • anv: drop likely/unlikely around INTEL_DEBUG

  • intel: drop likely/unlikely around INTEL_DEBUG

  • vulkan/wsi: fix possible random stalls in wsi_display_wait_for_event

  • intel/tools: fix invalid type in argument to printf

  • intel/genxml: don’t generate identical code for different branches

  • anv: always annotate memory returned from anv_gem_mmap

  • intel: remove dead code

  • i965: remove prototypes of not-existing functions

  • intel/compiler: use C++ template instead of preprocessor

  • intel/compiler: remove branch weight heuristic

  • intel/tools: allow --color option to be used without arg

  • anv: remove dead code from anv_create_cmd_buffer

  • intel/tools: handle some failures

  • intel/tools: refactor logging to be easier to follow by static analyzers

  • intel/tools: add missing new lines to few remaining fail_if users

  • nir: handle float atomics in copy propagation pass

Marek Olšák (278):

  • radeonsi: enable ETC2 hw acceleration on Raven2

  • ac/gpu_info: set num_tiles_pipes on gfx10+ too

  • Revert “radeonsi: honor a user-specified pitch on gfx10.3”

  • radeonsi: use correct wave size in gfx10_ngg_calculate_subgroup_info

  • radeonsi: use the same units for esgs_ring_size and ngg_emit_size

  • radeonsi: increase minimum NGG vertex count requirement per workgroup on gfx 10.3

  • radeonsi: fix applying the NGG minimum vertex count requirement

  • radeonsi: don’t count unusable vertices to the NGG LDS size

  • radeonsi: add a common function for getting the size of gs_ngg_scratch

  • radeonsi: remove the NGG hack decreasing LDS usage to deal with overflows

  • radeonsi: various fixes for gfx10.3

  • radeonsi: disable NGG culling on gfx10.3 because of hangs

  • radeonsi: fix compute-based culling with VERTEX_COUNTER_GDS_MODE == 1

  • compiler: add glsl_print_type

  • nir: remove nir_strip stub declaration

  • nir: handle load_input_vertex in nir_get_io_offset_src

  • nir: save IO semantics in lowered IO intrinsics

  • nir: gather all IO info from IO intrinsics

  • nir: update IO semantics in nir_io_add_const_offset_to_base

  • nir: print IO semantics (v2)

  • nir: properly identify texcoords for lowered IO in nir_lower_drawpixels

  • nir: add shader_info::io_lowered

  • nir: add interpolation qualifiers for color sysvals into shader_info

  • nir: generate lowered IO in nir_lower_passthrough_edgeflags

  • st/mesa: don’t pass NIR to draw module if IO is lowered

  • st/mesa: don’t generate NIR for ARB_vp/fp if NIR is not preferred

  • st/mesa: handle lowered IO in st_nir_assign_vs_in_locations

  • gallium/tgsi: add helper tgsi_get_interp_mode

  • radeonsi: fix tess levels coming as scalar arrays from SPIR-V

  • st/mesa: remove useless code for lowered IO in st_nir_assign_vs_in_locations

  • gallivm: fix build on LLVM 12 due to LLVMAddConstantPropagationPass removal

  • amd/registers: expose the canonicalize.py program as a function

  • amd/registers: sort registers by offset in json

  • amd/registers: add a script that generates json from kernel headers

  • amd/registers: add non-gfx10 register files generated from kernel headers

  • amd/registers: switch to new generated register definitions

  • nir: fix a bug in is_dual_slot in nir_io_add_const_offset_to_base

  • st/mesa: fix lowered IO - don’t call st_nir_assign_vs_in_locations twice

  • radeonsi: don’t crash if input_usage_mask is 0 for a VS input

  • radeonsi: get color interpolation info from shader_info

  • radeonsi: clean up code for loading VS inputs

  • ac/nir: handle all lowered IO intrinsics

  • radeonsi: lower IO intrinsics - complete rewrite of input/output scanning

  • radeonsi: remove in/out/uniform variables from NIR after lowering IO

  • radeonsi: don’t lower indirect IO in GLSL

  • radeonsi: don’t execute LDS stores for TCS outputs that are never read

  • radeonsi: simplify handling color interp modes in si_emit_spi_map

  • radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_selector::type)

  • radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_context::type)

  • radeonsi: change PIPE_SHADER to MESA_SHADER (debug flags)

  • radeonsi: change PIPE_SHADER to MESA_SHADER (si_compile_llvm)

  • radeonsi: change PIPE_SHADER to MESA_SHADER (si_get_shader_part)

  • radeonsi: remove unused si_shader_context::type

  • radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_dump_disassembly)

  • radeonsi: precompute si_*_descriptors_idx in si_shader_selector

  • radeonsi: change PIPE_SHADER to MESA_SHADER (si_dump_descriptors)

  • radeonsi: remove si_shader_selector::type

  • compiler: add INTERP_MODE_COLOR for radeonsi

  • radeonsi: replace TGSI_INTERPOLATE with INTERP_MODE

  • radeonsi: replace TGSI_SEMANTIC with VARYING_SLOT and FRAG_RESULT

  • radeonsi: optimize out the loop in si_get_ps_input_cntl

  • ac/llvm: fix unaligned VS input loads on gfx10.3

  • nir: get ffma support from NIR options for nir_lower_flrp

  • nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes

  • nir/algebraic: add 16-bit versions of a few 32-bit patterns

  • glsl_to_nir: fix crashes with int16 shifts

  • radeonsi: remove redundant no-signed-zero-fp-math LLVM attribute

  • radeonsi: move nir_shader_compiler_options into si_screen

  • Revert “ac: generate FMA for inexact instructions for radeonsi”

  • ac/llvm: remove stub prototype for fmed3

  • ac/llvm: fix amdgcn.rcp for v2f16

  • ac/llvm: fix amdgcn.fract for v2f16

  • ac/llvm: fix amdgcn.rsq for v2f16

  • ac/llvm: fix bcsel for v2*16

  • ac/llvm: remove dead code handling for fmod

  • ac/llvm: add better code for isign

  • ac/llvm: add better code for fsign

  • ac/llvm: fix b2f for v2f16

  • radeonsi: stop using TGSI_PROPERTY_NEXT_SHADER

  • radeonsi: stop using TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION / VS_BLIT_SGPRS_AMD

  • radeonsi: stop using TGSI_PROPERTY_TCS_VERTICES_OUT

  • radeonsi: stop using TGSI_PROPERTY_TES_POINT_MODE / TES_PRIM_MODE

  • radeonsi: stop using TGSI_PROPERTY_TES_SPACING

  • radeonsi: stop using TGSI_PROPERTY_TES_VERTEX_ORDER_CW

  • radeonsi: stop using TGSI_PROPERTY_GS_*

  • radeonsi: stop using TGSI_PROPERTY_CS_*

  • radeonsi: stop using TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL

  • radeonsi: stop using TGSI_PROPERTY_FS_POST_DEPTH_COVERAGE

  • radeonsi: stop using TGSI_PROPERTY_FS_COORD_PIXEL_CENTER

  • radeonsi: stop using TGSI_PROPERTY_FS_DEPTH_LAYOUT

  • radeonsi: stop using TGSI_PROPERTY_CS_LOCAL_SIZE

  • radeonsi: stop using TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS

  • radeonsi: remove info::samplers_declared, image_buffers, msaa_images_declared

  • radeonsi: remove redundant si_shader_info::shader_buffers_declared

  • radeonsi: remove redundant si_shader_info::images_declared

  • radeonsi: remove redundant si_shader_info::const_buffers_declared

  • radeonsi: remove redundant si_shader_info:*(clip|cull)* fields

  • radeonsi: remove unused si_shader_info::uses_(vertexid|basevertex)

  • radeonsi: merge uses_persp_opcode_interp_sample/uses_linear_opcode_interp_sample

  • radeonsi: remove redundant si_shader_info::uses_kill

  • radeonsi: reduce type sizes in si_shader_selector

  • radeonsi: rename num_memory_instructions -> num_memory_stores

  • radeonsi: remove redundant si_shader_info::writes_memory

  • radeonsi: remove redundant GS variables in si_shader_selector

  • radeonsi: remove redundant si_shader_selector::max_gs_stream

  • radeonsi: remove redundant si_shader_info::uses_derivatives

  • radeonsi: use shader_info::cs::local_size_variable to clean up some code

  • radeonsi: deduplicate setting key.mono.u.vs_export_prim_id

  • radeonsi: kill point size VS output if it’s not used by the rasterizer

  • radeonsi: set outputs_written_before_ps for geometry shaders too

  • radeonsi: eliminate unused shader outputs for separate NGG geometry shaders

  • radeonsi: remove swizzle == ~0 dead code in si_llvm_load_input_gs

  • ac,radeonsi: lower 64-bit IO to 32 bits and remove all dead code

  • radeonsi: inline trivial PS functions

  • nir: add mediump flag to IO semantics

  • nir: fix lower_mediump_outputs to not require variables

  • nir/algebraic: add flrp patterns for 16 and 64 bits

  • nir/algebraic: expand existing 32-bit patterns to all bit sizes using loops

  • nir: remove redundant opcode u2ump

  • nir: enforce 32-bit src type requirement for f2fmp and i2imp

  • nir: add new mediump opcodes f2[ui]mp, i2fmp, u2fmp

  • nir/algebraic: collapse conversion opcodes (many patterns)

  • nir/algebraic: add late optimizations that optimize out mediump conversions (v3)

  • nir/opt_vectorize: don’t lose exact and no_*_wrap flags

  • st/mesa: don’t enable NV_copy_depth_to_color if NIR doesn’t support FP64

  • nir,radeonsi: move ffma fusing to late optimizations for better codegen

  • radeonsi: clean up ffma handling

  • Revert “radeonsi: set BIG_PAGE fields on gfx10.3”

  • Revert “radeonsi: move L2_CACHE_CONTROL registers into si_emit_framebuffer_state”

  • radeonsi: don’t lower pack for better 16-bit vectorization

  • radeonsi: set flags for FP16 in shaders

  • radeonsi: implement 16-bit FS color outputs

  • radeonsi: vectorize IO for better ALU vectorization

  • radeonsi: don’t scalarize 16-bit vec2 ALU opcodes

  • radeonsi: add 16-bit ALU vectorization

  • gallium: rename PIPE_TRANSFER_* -> PIPE_MAP_*

  • gallium: rename pipe_transfer_usage -> pipe_map_flags

  • gallium: rename transfer flags -> map flags in comments

  • radeon: rename RADEON_TRANSFER_* -> RADEON_MAP_*

  • radeonsi: set TRUNC_COORD=0 for Total War: WARHAMMER to fix it

  • radeonsi: move debug options from si_disk_cache_create to si_get_ir_cache_key

  • radeonsi: remove KILL_PS_INF_INTERP/CLAMP_DIV_BY_ZERO, use screen::options

  • amd: add Dimgrey Cavefish support

  • amd: add VanGogh support

  • radeonsi: set KEEP_TOGETHER_ENABLE if needed

  • radeonsi: move binning parameters into si_screen

  • radeonsi: break a binning batch on a new PS if bins can use multiple state sets

  • radeonsi: add a tweak for PS wave CU utilization for gfx10.3

  • nir: split fuse_ffma into fuse_ffma16/32/64

  • nir: split lower_ffma into lower_ffma16/32/64

  • radeonsi: fuse or lower ffma optimally on all chips

  • nir: replace lower_ffma and fuse_ffma with has_ffma

  • radeonsi: use optimal order of operations when setting up a compute dispatch

  • radeonsi: call si_upload_graphics_shader_descriptors before the big conditional

  • radeonsi: move a displaced comment in si_draw_vbo

  • radeonsi: don’t call emit_cache_flush after uploading bindless descriptors

  • radeonsi: reorganize the code around the gfx9 scissor bug

  • radeonsi: move si_upload_vertex_buffer_descriptors into si_state_draw.c

  • radeonsi: add unlikely statements into si_draw_vbo

  • radeonsi: lift the conditional for skipping si_upload_vertex_buffer_descriptors

  • radeonsi: always inline draw-related functions that have only one use

  • nir: gather indirect info from lowered IO intrinsics

  • nir: gather tess.tcs_cross_invocation info from lowered IO intrinsics

  • nir: set system_values_read for all intrinsics

  • nir: gather fs.uses_sample_qualifier from lowered IO

  • nir: fix input/output info gathering for lowered IO

  • nir: gather information about fbfetch and dual source color

  • radeonsi: fix indirect dispatches with variable block sizes

  • radeonsi: call nir_shader_gather_info after lowering and optimizing NIR

  • radeonsi: use info.system_values_read

  • radeonsi: get information about FS color outputs from shader_info directly

  • radeonsi: get input/output usage flags from shader_info directly

  • radeonsi: run NIR optimizations that glsl_to_nir runs but other places might not

  • radeonsi: assume that constant load_local_group_size has been optimized out

  • radeonsi: remove redundant variables from struct si_compute

  • radeonsi: remove redundant info.uses_fbfetch

  • gallivm: add support for lowered IO in vertex shaders

  • util: implement f16c - fast half<->float conversions

  • util: move util_half_to_float code into _mesa_half_to_float_slow

  • util: remove util_float_to_half and util_half_to_float wrappers

  • gallium/util: remove redundant util_float_to_half_rtz

  • gallium/util: remove empty file u_half.h

  • radeonsi: Fix deadlock with aux_context_lock in si_screen_clear_buffer.

  • radeonsi: simplify NGG culling enablement and add radeonsi_shader_culling option

  • radeonsi: kill disabled clip distances and planes at per-channel granularity

  • radeonsi: move si_set_active_descriptors_for_shader into si_update_common_shader_state

  • radeonsi: use staging buffer uploads for most VRAM buffers

  • radeonsi: call nir_lower_bool_to_int32 last because it breaks nir_opt_if

  • radeonsi: restructure si_pipe_set_constant_buffer

  • mesa: factor out layout parsing for glInterleavedArrays

  • gl_marshal.py: inline print_sync_dispatch

  • driconf: force the vendor string to NVIDIA to fix viewperf energy tests

  • driconf: enable force_glsl_extensions_warn for viewperf

  • st/mesa: enable GL name reuse for queries based on the driconf option

  • util/idalloc: resize if ID is too large for reservation

  • gallium/util: add set_frontend_noop into driver_noop and u_threaded_context

  • radeonsi: remove dead variable postponed_kill

  • radeonsi: implement GL_INTEL_blackhole_render

  • gallium/u_threaded_context: don’t call memcpy in tc_set_constant_buffer

  • gallium/u_threaded_context: always flush asynchronously if requested

  • gallium/u_threaded_context: fix use-after-free in transfer_unmap

  • util: implement F16C using inline assembly on x86_64

  • util: move util_half_to_float code into _mesa_half_to_float_slow

  • util: remove util_float_to_half and util_half_to_float wrappers

  • gallium/util: remove redundant util_float_to_half_rtz

  • gallium/util: remove empty file u_half.h

  • mesa: don’t use GET_DISPATCH because it doesn’t work with glthread

  • mesa: remove api_loopback to remove call indirections

  • glthread: handle glInterleavedArrays

  • nir/algebraic: always lower idiv to shifts if bitops are allowed

  • util: add _mesa_set_create_u32_keys where keys are not pointers

  • nir: add new helper passes that lower uniforms to literals

  • gallium: add pipe_context::set_inlinable_constants

  • st/mesa: pass inlinable uniforms to drivers if they requested it

  • ac/surface: fix valgrind warnings in DCC retile tile lookups

  • winsys/amdgpu: rework the VM alignment optimizations

  • winsys/amdgpu: apply the VM alignment optimization to the physical alignment too

  • radeonsi: update the DMA perf test

  • radeonsi: disable SDMA on gfx6-7 and gfx10.3 to decrease CPU overhead

  • Revert “radeonsi/gfx10: disable vertex grouping”

  • radeonsi: don’t disable NGG culling on gfx10.3

  • radeonsi: enable NGG culling by default on gfx10.3 dGPUs

  • radeonsi: optimize out LDS bank conflicts in the NGG culling shader

  • radeonsi: remove indirection when loading position at the end for NGG culling

  • radeonsi: write VS/TES system values into LDS after culling

  • radeonsi: pack LDS better for NGG culling

  • radeonsi: tweak LATE_ALLOC_GS numbers for faster NGG culling

  • radeonsi: enable NGG on Navi14 PRO cards

  • radeonsi: enable NGG culling by default on Navi1x PRO cards

  • ac/llvm: don’t lower bool to int32, switch to native i1 bool

  • amd: update addrlib

  • nir: consider load_color intrinsics as both inputs and sysval in gathering

  • Revert “st/mesa: don’t pass NIR to draw module if IO is lowered”

  • st/mesa: make sure prog->info is up to date for NIR (v2)

  • amd: regenerate gfx103.json from kernel headers

  • amd: correct typos in gfx10-rsrc.json

  • amd: update gfx10-rsrc.json for gfx10.3

  • amd: replace 0x028848 with the register definition

  • amd: print NUM_PKRS with AMD_DEBUG=info on gfx10.3

  • Revert “radeonsi: use staging buffer uploads for most VRAM buffers”

  • util: remove unused util_get_L3_for_pinned_thread

  • util: consolidate thread_get_time functions

  • st/mesa: remove random L3 pinning heuristic for glthread

  • util: add util_set_thread_affinity helpers including Windows support

  • util: add util_get_current_cpu using sched_getcpu and Windows equivalent

  • util: completely rewrite and do AMD Zen L3 cache pinning correctly

  • glthread: pin driver threads to the same L3 as the main thread regularly

  • radeonsi: implement inlinable uniforms

  • gallium: move pipe_draw_info::start/count to the beginning and pad empty space

  • gallium: add pipe_context::multi_draw

  • winsys/amdgpu: remove incorrect assertion check against max_check_space_size

  • radeonsi: add num_draws parameter into si_need_gfx_cs_space

  • radeonsi: don’t get count from pipe_draw_info in si_num_prims_for_vertices

  • radeonsi: don’t check info->count == 0

  • radeonsi: implement multi_draw but supporting only 1 draw

  • radeonsi: add support for multi draws

  • radeonsi: set NOT_EOP for back-to-back draws on gfx10+

  • radeonsi: implement multi_draw for compute-based primitive culling

  • gallium/u_threaded: move a structure up to be used later

  • gallium/u_threaded: merge consecutive draw calls within batches

  • st/mesa: fix use-after-free when updating shader info in st_link_nir

  • radeonsi: fix min_direct_count value

  • radeonsi: do VGT_FLUSH when switching NGG -> legacy on Sienna Cichlid

  • radeonsi: only do VGT_FLUSH for fast launch if previous draw was normal launch

  • radeonsi: determine correctly if switching from normal launch to fast launch

  • radeonsi: add options.inline_uniforms to the shader cache key

  • ac: fix detection of Pro graphics

  • ac: fix min/max_good_num_cu_per_sa on gfx10.3 with disabled SEs

  • radeonsi: fix NGG streamout regression

  • radeonsi: fix scan_instruction for bindless inc_wrap/dec_wrap atomics

  • nir: fix gathering TCS cross invocation access with lowered IO

  • nir: fix gathering patch IO usage with lowered IO

  • ac/nir: fix a typo in ac_are_tessfactors_def_in_all_invocs

  • mesa: call FLUSH_VERTICES before changing sampler uniforms

  • st/mesa: fix uninitialized/random clip plane state vars in lower_ucp

  • radeonsi: fix a memory leak in si_create_dcc_retile_cs

  • radeonsi: fix a nasty bug in si_pm4.c

  • radeonsi: disable WGP mode on gfx10.3 to prevent hangs

Marek Vasut (2):

  • etnaviv: Remove etna_resource_get_status()

  • etnaviv: Add lock around pending_ctx

Marijn Suijten (5):

  • util: Makefile.sources: Add disk_cache_os.{c,h}

  • android: gallium/auxiliary: Deduplicate nir_to_tgsi.c inclusion

  • scons: gallium/auxiliary: Unconditionally compile NIR regardless of LLVM

  • android: panfrost: Move nir_undef_to_zero to util

  • android: freedreno: Add freedreno_dev_info.[ch] to Makefile.sources

Mark Janes (2):

  • intel/fs: Assert if lower_source_modifiers converts 32x16 to 32x32 multiplication

  • intel/fs: work around gen12 lower-precision source modifier limitation

Mark Menzynski (5):

  • nv50/ir: Use a bit field in info_out structure

  • nv50/ir: Add nv50_ir_prog_info_out serialize and deserialize

  • nv50/ir: Add prog_info_out print

  • nv50/ir: Add nv50_ir_prog_info serialize

  • nvc0: Add shader disk caching

Martin Peres (11):

  • driconf: bump the maximum string size from 25 to 1024

  • driconf: initialize the option value before using it

  • dri/DRI2ConfigQueryExtension: add support for string options

  • glx/extensions: split set_glx_extension into find_ and set_

  • glx: stop using hardcoded array sizes for bitfields

  • glx: initial plumbing to let users force-enable/disable extensions

  • glx: let users force-enable/disable indirect GL extensions

  • driconf: add a way to override GLX extensions

  • driconf: add a way to override indirect-GL extensions

  • driconf: disable GLX_OML_swap_method by default on Brink

  • driconf: allow higher compat version for Brink

Matt Turner (3):

  • intel/tools: Disassemble WAIT’s argument as a destination

  • Revert F16C series (MR 6774)

  • glcpp: Handle bison-3.6 error message changes

Mauro Rossi (28):

  • android: panfrost: Rename encoder/ to lib/

  • android: panfrost: Move pandecode into lib/

  • android: pan/mdg: Separate disassembler and compiler targets

  • android: pan/bi: Separate disasm/compiler targets

  • android: panfrost: Redirect cmdstream includes through GenXML

  • android: panfrost/bifrost: add libpanfrost_lib static dependency

  • android: panfrost: Redirect cmdstream includes through GenXML (v2)

  • android: util/format: fix generated sources rules

  • android: amd/registers: switch to new generated register definitions

  • android: util: fix missing include path

  • android: nv50/ir: Add nv50_ir_prog_info_out serialize and deserialize

  • android: freedreno: Implement pipe screen’s get_device/driver_uuid()

  • android: freedreno/common: add libmesa_git_sha1 static dependency

  • egl/android: HAVE_DRM_GRALLOC path fixes (v2)

  • android: aco/isel: Move context initialization code to a dedicated file

  • android: pan/bi: Use new disassembler

  • android: pan/bi: Use new packing

  • android: pan/bi: fix typo in bifrost_gen_disasm.c gen rules

  • android: gallium/iris: cleanup iris_driinfo.h gen rules

  • android: gallium/radeonsi: cleanup si_driinfo.h gen rules

  • android: gallium/virgl: cleanup virgl_driinfo.h gen rules

  • android: util: add log.c to Makefile.sources

  • android: pan/bi: Use new disassembler (v2)

  • android: panfrost: use python3 for generated sources rules

  • android: util: Move xxd.py to util

  • android: util,ac,aco,radv: Cross-platform memstream API

  • android: fix libsync dependencies (v2)

  • android: aco: add aco_form_hard_clauses.cpp to Makefile.sources

Michael Olbrich (1):

  • meson.build: xxf86vm is not needed for -Dglx-direct=false

Michael Tretter (2):

  • etnaviv: fix comment for source of etna_mesa_debug

  • etnaviv: free tgsi tokens when shader state is deleted

Michel Dänzer (31):

  • ci: Fix up rules for post-merge / main project branch pipelines

  • ci: Create test-docs job in mesa/mesa pipelines for MRs

  • ci: Don’t exclude “success” job from mesa/mesa pipelines for MRs

  • ci: Restrict “success” job to pipelines for MRs

  • ci: Do not create manual test-docs job in post-merge pipelines

  • ci: Remove any existing results directory before running piglit

  • ci: Add “is scheduled pipeline” YAML anchor

  • ci: Add “is master branch of main project” YAML anchor

  • ci: Add “is pre-merge pipeline for Marge Bot” YAML anchor

  • ci: Add “is post-merge pipeline, not for Marge Bot” YAML anchor

  • ci: Add “is forked branch or pre-merge pipeline” YAML anchor

  • ci: Add “is forked branch” YAML anchor

  • ci: Add “is post-merge pipeline” YAML anchor

  • ci: Add “is pre-merge pipeline” YAML anchor

  • ci: Add “is for Marge Bot” YAML anchor

  • ci: Always use CI_PROJECT_NAMESPACE instead of CI_PROJECT_PATH

  • ci: Prevent pages job from running in pre-merge pipelines

  • ci: Don’t create test-docs job if the pages one exists in the pipeline

  • ci: Use ignore_scheduled_pipelines anchor in .radeonsi-rules

  • gallium: Make pipe_viewport_state swizzle_x/y/z/w bit-fields 8 bits wide

  • ci: Move test-docs job to deploy stage

  • ci: Add empty needs: to pages job

  • ci: Add jobs running ci-fairy checks

  • loader/dri3: Only allocate additional buffers if needed

  • loader/dri3: Keep current number of back buffers if frame was skipped

  • loader/dri3: Allocate up to 4 back buffers for page flips

  • ci: Add “check mr” job to needs: of build jobs

  • ci: Run git_archive job if all_paths matches

  • i965/bufmgr: Handle NULL bufmgr in brw_bufmgr_get_for_fd

  • iris/bufmgr: Handle NULL bufmgr in iris_bufmgr_get_for_fd

  • ac: Don’t negate strstr return values in ac_query_gpu_info

Michel Zou (9):

  • swr: fix build with mingw

  • swr: missing _BitScanForward64 on 32 bits win

  • swr: fix _BitScanForward64 on unix

  • util: drop non-posix header fnmatch

  • lavapipe: fix usleep usage in lvp_device

  • wsi: move drm code to wsi_common_drm.c

  • gallium: use libpipe_loader_links

  • lavapipe: configure suffix in icd json

  • util: use dllexport for mingw too

Mike Blumenkrantz (118):

  • zink: basic primitive restart support for strip/fan topologies

  • zink: move 8bit index handling out of u_primconvert path

  • zink: use util_draw_vbo_without_prim_restart for unsupported prim modes

  • zink: set primitive restart cap

  • zink: move shader state methods for pipe_context into zink_program.c

  • zink: adjust zink_shader struct to contain full streamout info

  • zink: refcount zink_gfx_program objects

  • zink: split up creating zink_shader objects and VkShaderModule objects

  • zink: use ZINK_SHADER_COUNT instead of PIPE_SHADER_TYPES - 1 everywhere

  • zink: start using per-stage flags for new shaders, refcount shader modules

  • zink: always compile shaders in pipeline order

  • zink: rename zink_gfx_program::stages to ‘modules’

  • gallium: add pipe_transfer_usage for z/s only mappings

  • gallium/u_transfer_helper: add util functions for doing deinterleaving during map

  • zink: print error when getprocaddr fails for extension functions

  • zink: change pipeline hashes to index based on vk primitive type

  • zink: handle more draw modes

  • zink: invalidate pipeline hash on more changes

  • zink: use u_transfer_helper to split/merge interleaved depth/stencil formats

  • zink: add note about buffer<->image copy functions not handling multisample

  • zink: generically handle matrix types

  • anv: improve error message when failing to open device path

  • anv: assert that the target bo is valid when adding a reloc list

  • zink: use correct value for color buffer sample count when creating renderpass

  • zink: use correct number of samples on framebuffer in set_framebuffer_state

  • zink: use correct layer count when creating framebuffer

  • zink: clamp min created fb size to 1x1

  • zink: verify that src and dst aspects are the same in resource_copy_region hook

  • zink: implement ARB_instanced_arrays

  • zink: move viewport count to zink_gfx_pipeline_state

  • zink: set multiviewport cap in ntv when gl_ViewportIndex is a written output

  • zink: correctly set up fb-sized scissors for each viewport

  • zink: apply viewport count when creating pipelines

  • zink: reorder create_stream_output_target to fix failure case leak

  • zink: combine all surface layout-setting for src/dst into util function

  • zink: unify all occurrences of waiting on a fence

  • zink: correctly handle ARB_arrays_of_arrays in ntv for samplers

  • zink: run nir_lower_uniforms_to_ubo conditionally

  • zink: fix shader buffer size caps to use 65536

  • zink: always emit descriptor set 0 in ntv

  • zink: emit ubo variables sized based on the overall ubo block size

  • zink: don’t emit ubos or bindings for ubo variables

  • zink: correctly set up ubo bindings and buffer indices

  • zink: use sizeof(vec4) multiplier for nir_lower_uniforms_to_ubo

  • zink: hook up driconf

  • xmlconfig: fix scandir_filter

  • zink: handle timestamp queries

  • zink: handle TIME_ELAPSED queries

  • zink: add pipe_context::get_timestamp hook

  • zink: enable pipe caps for ARB_timer_query

  • anv: remove VkPipelineCacheCreateInfo::flags assert

  • radv: remove VkPipelineCacheCreateInfo::flags assert

  • util/hash_table: add function for reserving size in a hash table

  • zink: enable VK_KHR_vulkan_memory_model extension

  • zink: add VK_EXT_custom_border_color

  • zink: support VK_EXT_blend_operation_advanced

  • zink: support VK_EXT_extended_dynamic_state

  • zink: add VK_EXT_pipeline_creation_cache_control

  • zink: enable VK_EXT_shader_stencil_export

  • zink: ARB_uniform_buffer_object is now implemented, so add cap and feature doc

  • glsl: fix up location setting for variables pointing to a UBO’s base

  • nir: update ubo locations in nir_lower_uniforms_to_ubo

  • zink: add a mechanism to track current resource usage in batches

  • zink: optimize transfer_map for resources with pending reads/writes

  • zink: add more explicit fencing for transfer maps

  • zink: explicitly flag fb attachments as being written to in render passes

  • zink: don’t leak sampler view textures

  • zink: redo slot mapping again for the last time really I mean it

  • zink: export PIPE_CAP_MAX*_VARYINGS values

  • zink: unify code for emitting named uint-based variable instructions

  • glsl: more accurately handle swizzle in 64bit varying split with no left value

  • zink: increase descriptor pool sizes for other descriptor types we’ll be using

  • zink: implement ARB_texture_buffer_object

  • zink: ensure resource tracking for sampler buffers in render batches

  • zink: assert valid format in zink_create_sampler_view()

  • zink: handle null attachment for ARB_texture_buffer_object samplers

  • zink: add VK_BUFFER_USAGE_INDEX_BUFFER_BIT to vertex buffer creation

  • zink: add last few format maps for ARB_vertex_type_2_10_10_10_rev

  • zink: fix stencil wrapping

  • zink: add some spirv_builder functions we’ll be using for geometry shaders

  • zink: handle shader io vars more generically for use with gs

  • zink: add ntv handling for geometry shader variables

  • zink: re-transform gl_Position for gs input

  • zink: add handling for gs in ntv

  • zink: remove ADJACENCY prim types from primconvert path

  • zink: round out handling for streamout buffer stride setting during draw

  • zink: add gallium handling for geometry shaders

  • zink: enable gs pipe caps

  • zink: bump to glsl 1.40

  • zink: mark off GL 3.1 as done in features.txt

  • zink: GLSL 1.50

  • zink: set 3.2 complete in features.txt

  • zink: bump GLSL to 3.30

  • zink: set 3.3 complete in features.txt

  • zink: implement ARB_draw_indirect

  • zink: add helper for vec-type input variables in ntv

  • zink: add ntv handling for ARB_sample_shading

  • zink: add a pipe_context::get_sample_position hook

  • zink: mark ARB_sample_shading as supported

  • doc/features: remove zink entries for GL 3.3 items

  • zink: deduplicate some query result code

  • zink: more correctly handle PIPE_QUERY_PRIMITIVES_GENERATED queries

  • zink: also create an xfb query for every primitives generated query

  • zink: store batch id onto query object at time of start

  • zink: fixup gs/xfb tracking for primitives generated queries

  • zink: rework query overflow handling

  • zink: always use query->type for starting/stopping xfb queries

  • zink: always reset query pools on next query begin

  • zink: add pass for lowering dynamic ubo/ssbo block indexing to constants

  • zink: break up dynamic access lowering

  • util/threaded_context: use driver’s buffer alignment for staging transfers

  • nir/clip_disable: write 0s instead of undefs for disabled clip planes

  • nir/clip_disable: try for better no-op

  • nir/clip_disable: handle 2x vec4 case

  • zink: implement ARB_texture_query_lod

  • zink: use same function for all pipe_context::delete_*_state shader methods

  • zink: add a quadop function in spirv_builder

  • zink: add some spirv builder functions for barriers

Nanley Chery (46):

  • dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_B8G8R8X8_UNORM

  • iris: Don’t call SET_TILING for dmabuf imports

  • iris: Make iris_bo_import_dmabuf take a modifier

  • iris: Drop iris_resource_alloc_separate_aux

  • iris: Drop unused resource allocation optimization

  • iris: Drop old comment on clear color BO allocation

  • iris: Move size/offset calculations out of configure_aux

  • iris: Add and use iris_resource_configure_main

  • iris: Drop buffer support in resource_from_handle

  • gallium/dri2: Report correct YUYV and UYVY plane count

  • iris: Fix aux assertion in resource_get_handle

  • iris: Fold a condition into no_gpu for consistency

  • iris: Make iris_has_color_unresolved more generic

  • iris: Avoid resolving Z/S reads in transfer_map

  • iris: Drop a use of the need_resolve boolean

  • iris: Better determine map_would_stall for Z/S

  • gallium/dri2: Report I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS num_planes

  • gallium/dri2: Support I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS import

  • intel/isl: Describe I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS

  • intel/isl: Support ISL_AUX_USAGE_MC in surface states

  • intel/isl: Add YUV format info for the aux-map

  • st/mesa: Don’t map all P01X DRM formats to P016

  • intel/common: Add get_aux_map_format_bits()

  • iris: Support planar resource imports for MC

  • intel/common: Drop unused gen_aux_map_add_image

  • iris: Support MC modifier in plane count queries

  • iris: Support I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS

  • blorp: Fix alignment test for HIZ_CCS_WT fast-clears

  • blorp: Drop trailing whitespace in blorp_clear.c

  • anv/image: Disable multi-layer CCS_E on TGL+

  • blorp: Ensure aligned HIZ_CCS_WT partial clears

  • iris: Fix a fast-clear skipping optimization

  • anv: Enable multi-layer aux-map init for HIZ+CCS

  • Revert “anv: Add driconf option to disable compression for 16bpp format”

  • iris: Add fast-clear restriction for 8bpp surfaces

  • isl: Allow CCS for 8bpp surfaces with 3+ miplevels

  • st/mesa: Add missing sentinels in format_map[]

  • intel/isl: Drop redundant unpack of unorm channels

  • isl: Fix the aux-map encoding for D24_UNORM_X8

  • iris: Fix fast-clears of swizzled LA formats

  • iris: Fix SINT assert in convert_fast_clear_color

  • iris: Fix fast-clears of swizzled alpha formats

  • iris: Flush dmabufs during context flushes

  • mesa: Add and use _mesa_has_depth_float_channel

  • mesa: Clamp some depth values in glClearBufferfv

  • mesa: Clamp some depth values in glClearBufferfi

Neil Roberts (3):

  • v3d: Make the function to set tex dirty state for a stage global

  • v3d: Split the creating of TEXTURE_SHADER_STATE into a helper function

  • v3d: Update the TEXTURE_SHADER_STATE when there’s a new buf for a tex

Philipp Zabel (3):

  • meson: fix power8 option

  • gallium/dri: fix dri2_query_image for multiplanar images

  • gallium/dri: fix dri2_from_planar for multiplanar images

Pierre Moreau (5):

  • clover/spirv: Remove unused tuple header

  • clover/spirv: Print linked SPIR-V module if asked

  • meson: Raise minimum version for SPIR-V OpenCL deps (v4)

  • clover/llvm: Use the highest supported SPIR-V version (v4)

  • clover/nir: Register callback for translation messages (v2)

Pierre-Eric Pelloux-Prayer (61):

  • ac/llvm: handle static/shared llvm init separately

  • mesa/st: introduce PIPE_CAP_NO_CLIP_ON_COPY_TEX

  • radeonsi: enable PIPE_CAP_NO_CLIP_ON_COPY_TEX

  • ac/llvm: add option to clamp division by zero

  • radeonsi,driconf: add clamp_div_by_zero option

  • radeonsi: use radeonsi_clamp_div_by_zero for SPECviewperf13, Road Redemption

  • amd/llvm: switch to 3-spaces style

  • amd/common: switch to 3-spaces style

  • mesa: move u_idalloc from gallium/aux/util to util

  • util/idalloc: add util_idalloc_reserve

  • util/idalloc: add lowest_free_idx to avoid iterating from 0

  • mesa: add a isGenName parameter to _mesa_HashInsert

  • mesa: add GL name reuse support

  • mesa: add _mesa_HashFindFreeKeys

  • mesa: use _mesa_HashFindFreeKeys for GL functions

  • driconf: add option to reuse GL names

  • glsl: fix per_vertex_accumulator::fields size

  • r600/uvd: set dec->bs_ptr = NULL on unmap

  • radeon/vcn: set dec->bs_ptr = NULL on unmap

  • radeonsi: fix quant_mode selection for large negative values

  • radeonsi: fix guardband handling for large values

  • mesa: fix glUniform* when a struct contains a bindless sampler

  • gallium: add PIPE_CAP_MAX_TEXTURE_MB

  • radeonsi: move GL vendor workaround to drirc

  • radeonsi: reduce PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE value

  • radeonsi: change vendor name to AMD

  • radeonsi: force linear for textures with height=1 (gfx6-8)

  • radeonsi/tmz: use secure job if framebuffer has dcc

  • radeonsi/tmz: use secure job if using an encrypted z/s buffer

  • radeonsi/tmz: add safety assert when tmz is enabled

  • radeonsi/tmz: allocate depth/stencil buffers as encrypted

  • radeonsi: introduce SI_RESOURCE_FLAG_INTERNAL / RADEON_FLAG_DRIVER_INTERNAL

  • amd: add AMDGPU_IDS_FLAGS_TMZ definition to amdgpu_drm.h

  • ac/gpu_info: add detection of TMZ support

  • radeonsi/tmz: allow secure job if the app made a tmz allocation

  • amd/winsys: add RADEON_FLUSH_TOGGLE_SECURE_SUBMISSION

  • radeonsi/tmz: fail si_texture_transfer_map if tex is encrypted

  • radeonsi/tmz: add tmz variant of sctx::wait_mem_scratch

  • radeonsi/tmz: add tmz variant for sctx::tess_rings

  • radeonsi: disable primitive discard if tmz is in use

  • radeonsi/tmz: add a tmz variant for sctx::eop_bug_scratch

  • radeonsi/tmz: add workaround for mpv/vaapi subtitles

  • amd/tmz: move uses_secure_bos to radeon_winsys

  • gallium/vl: do not call transfer_unmap if transfer is NULL

  • gallium/vl: add chroma_format arg to vl_video_buffer functions

  • omx/tizonia: fix build

  • gallium: add new cap PIPE_CAP_DEVICE_PROTECTED_CONTENT

  • gallium: introduce PIPE_BIND_PROTECTED

  • radeonsi: honor PIPE_BIND_PROTECTED

  • egl: implement EGL_EXT_protected_surface support

  • radeonsi: enable PIPE_CAP_DEVICE_PROTECTED_CONTENT

  • egl: handle EGL_PROTECTED_CONTENT_EXT for eglImage

  • dri: introduce createImageFromDmaBufs3

  • egl/dri2: implement createImageFromDmaBufs3

  • driconf: add disable_protected_content_check option

  • radeonsi: fix RADEON_FLUSH flags conflicts

  • radeon: add si_vid_create_tmz_buffer helper

  • radeon/vcn: delay dec->ctx and dec->dpb allocation

  • va/picture: make sure destination buffer is protected if needed

  • va: support VA_RT_FORMAT_PROTECTED

  • radeonsi/gfx10: flush gfx cs on ngg -> legacy transition

Pierre-Loup A. Griffais (2):

  • radv: fix null descriptor for dynamic buffers

  • radv: fix vertex buffer null descriptors

Qiang Yu (4):

  • radeonsi: fix syncobj wait timeout

  • radeonsi: fix user fence space when MCBP is enabled

  • radeonsi: fix max syncobj wait timeout

  • radeonsi: fix user fence GPU address

Rhys Perry (160):

  • aco: fix C++11/C++14 compilation

  • aco: set constant_data_offset correctly in the case of merged shaders

  • aco: don’t move memory accesses to before control barriers

  • nir/opt_remove_phis: optimize out phis with undef

  • gitlab: ask inxi output to be in code blocks

  • util: add a alignof() macro

  • nir: fix potential left shift of a negative value

  • nir: fix memory leak in nir_cf_list_clone

  • radv: don’t pass null to _mesa_sha1_update

  • radv: align pipeline cache entry and header sizes

  • radv: fix null memcpy and zero-sized malloc

  • aco: fix non-rtz pack_half_2x16

  • nir: add and use nir_intrinsic_has_ helpers

  • aco: use nir_intrinsic_has_access

  • bifrost: use nir_intrinsic_has_type

  • aco: consider branch definitions in spiller

  • aco: don’t consider the first partial spill if it’s the wrong type

  • aco: don’t fix break condition for break+discard to exec

  • aco: fix regclass checks when fixing to vcc/exec with Builder

  • aco: fix spills_entry heuristic for branch blocks in init_live_in_vars()

  • aco: keep loop live-through variables spilled

  • aco: reserve 2 sgprs for each branch

  • aco: create long jumps

  • aco/tests: add test for GFX10 0x3f bug

  • aco: shorten disassembly for repeated instructions

  • aco/tests: add tests for long jumps

  • aco: remove 64-bit SGPR ubfe/ibfe

  • aco: fix sgpr ubfe/ibfe if the offset is too large

  • aco: sink get_alu_src() in bfe lowering

  • spirv: fix Uniform and Output MemoryAccessMakePointer{Visible,Available}

  • spirv: make OpLoad/OpStore visibility/availablity barriers acquire/release

  • spirv: add vtn_emit_make_{visible,available}_barrier helpers

  • spirv: implement MakePointerAvailable/MakePointerVisible for OpCopyMemory

  • spirv: implement Volatile memory semantic

  • spirv: implement Volatile image operand

  • spirv: implement SpvMemoryAccessVolatileMask

  • spirv: add some tests for volatile/available/visible

  • radv: remove descriptor_indexing fails from expected fails

  • aco: fix mad splitting after applying output modifiers

  • aco: remove omod_success/clamp_success

  • aco: fix byte_align_scalar for 3 dword vectors

  • nir/load_store_vectorize: rework alignment calculation

  • nir/opt_shrink_vectors: shrink image stores using the format

  • aco: fix one-off error in Operand(uint16_t)

  • aco: improve fsign selection

  • nir/opt_if: fix opt_if_merge when destination branch has a jump

  • nir/opt_loop_unroll: fix is_access_out_of_bounds with vectors

  • aco: fix v_writelane_b32 with two sgprs

  • aco: workaround disassembler bug of v_writelane_b32 with literal

  • aco: don’t apply constant to SDWA on GFX8

  • aco: fix value numbering of reductions

  • aco: fix validation of sub-dword parallel-copies

  • aco: pass -fno-exceptions and -fno-rtti

  • aco: fix incorrect assertion in emit_vop3a_instruction()

  • radv: initialize with expanded cmask if the destination layout needs it

  • radv,aco: fix reading primitive ID in FS after TES

  • aco: keep track of temporaries’ regclasses in the Program

  • aco: use bit vectors for liveness sets

  • aco: use io semantics to get an intrinsic’s slot

  • aco: use nir_get_io_offset_src() in visit_load_input()

  • aco: use nir’s constant source helpers more

  • aco: remove dead indirect fs input loading

  • aco: stop multiplying driver_location by 4

  • st/nir: call nir_opt_access before gl_nir_lower_buffers

  • radeonsi: don’t use nir_opt_access

  • nir/instr_set: hash intrinsic sources

  • nir/load_store_vectorize: improve vectorization with identical operations

  • aco: fix get_buffer_resource_flags()

  • aco: remove trailing whitespace

  • radv: remove trailing whitespace

  • aco: Add loop creation helpers.

  • nir: return progress from nir_lower_io_to_scalar_early

  • radv: move optimizations in shader_compile_to_nir() to after io_to_scalar

  • radv: use radv_optimize_nir() less in radv_link_shaders()

  • spirv: add and use a generator id enum

  • spirv: replace discard with demote for incorrect HLSL->SPIR-V translations

  • radv: remove RDR2 discard workaround

  • android: fix SPIR-V -> NIR build

  • aco: optimize more uniform reductions/scans

  • aco: implement elect

  • radv/aco,nir/lower_subgroups: don’t lower elect

  • nir: add last_invocation intrinsic

  • aco: implement last_invocation

  • nir: move divergence analysis options to nir_shader_compiler_options

  • nir: allow divergence information to be updated when inserting instruction

  • nir: add pass to optimize uniform atomics

  • aco: use nir_opt_uniform_atomics

  • nir/opt_uniform_atomics: optimize image atomics

  • nir/opt_uniform_atomics: don’t optimize atomics twice

  • aco: fix get_ssbo_size with a vgpr resource

  • scons: fix SPIR-V -> NIR build

  • nir/opt_uniform_atomics: remove useless returns

  • aco: implement 16-bit literals

  • aco: propagate literals into sub-dword pseudo instructions on GFX9+

  • aco: don’t use v_pack_b32_f16 if 16-bit input denormals are flushed

  • nir/opt_load_store_vectorize: don’t vectorize stores across demote

  • nir/opt_load_store_vectorize: add some tests for discard/demote behaviour

  • aco: add missing SCC clobber in get_buffer_size

  • ci: disable check commits job for now

  • nir/loop_analyze: adjust force unrolling to only include interesting modes

  • ac/nir: remove bindless image atomic format check

  • aco: remove isel_context::allocated

  • aco: update phi_map in add_subdword_operand()

  • aco: don’t do divergent break+discard

  • aco: skip value numbering of copies

  • aco: copy-propgate through p_create_vector during value numbering

  • aco: expand vectors passed as copy operands

  • aco: don’t use bld.copy() in handle_operands()

  • aco: allow literals on sub-dword p_parallelcopy

  • aco: always use p_parallelcopy for pre-RA copies

  • aco: use Builder::copy more

  • aco: remove some unused optimizations

  • aco: use v_mov_b32_sdwa for some 16-bit constants

  • aco: remove all-undef phi opt

  • aco: ignore the ACO-inserted continue in create_continue_phis()

  • aco: default to a definition size of 32

  • aco: round bytes_written to dwords if larger than 4 bytes

  • aco: use control flow creation helpers in select_gs_copy_shader

  • aco: use mubuf helper in select_gs_copy_shader

  • aco: move individual instruction disassembly to its own helper

  • aco: refactor repeated instruction disassembly

  • aco: switch aco_print_asm to a FILE *

  • aco: create s_clause on GFX10+

  • aco: assert a label only uses one of the members in ssa_info’s union

  • aco: fix printing of some sdwa sels

  • aco: fix combine_inverse_comparison()

  • aco: don’t allow destination opsel for v_cvt_pknorm

  • aco: handle SDWA in the optimizer

  • docs/features: update unpromoted Vulkan extensions

  • docs/features: add Vulkan 1.2

  • radv: add some missing radv_{start,stop}_feedback

  • radv: fix shader caching with discard->demote workaround

  • radv: fix shader caching with NaN fixup workaround

  • nir: scalarize fdot in reverse

  • spirv: reverse order in matrix multiplication

  • nir/algebraic: better propagate constants up fadd chains

  • nir: add nir_alu_src_is_trivial_ssa()

  • nir: skip bcsel with non-trivial swizzle in opt_simplify_bcsel_of_phi()

  • nir: use nir_alu_src_is_trivial_ssa() in nir_ssa_for_alu_src()

  • nir: add shader_info::bit_sizes_used

  • nir/lower_bit_size: optimize upcast of b2i8/b2i16

  • radv: move a few passes to after load/store vectorization

  • radv: do nir_lower_bit_size after algebraic optimizations

  • radv: rework nir_lower_bit_size callback and run DA on GFX8+

  • aco: implement some 16-bit arithmetic instead of lowering

  • aco: implement 8/16-bit instructions which can be trivially widened

  • spirv: fix GLSLstd450Modf/GLSLstd450Frexp when the destination is vector

  • util: add mapping from Vulkan to Gallium R64 integer formats

  • amd/common: add PIPE_FORMAT_R64_{UINT,SINT} to GFX10 format table

  • aco: implement 64-bit images

  • ac/nir: implement 64-bit images

  • radv: implement VK_EXT_shader_image_atomic_int64

  • aco: don’t combine precise max(min()) to med3

  • aco: fix combine_constant_comparison_ordering() NaN check with 16/64-bit

  • aco: disallow various v_add_u32 opts if modifiers are used

  • aco: disable omod if the sign of zeros should be preserved

  • aco: fix fp16 *0.5 omod

  • aco: fix v_mul_hi_u32_u24 format

  • nir/unsigned_upper_bound: fix buffer overflow in search_phi_bcsel

  • nir: fix sampler_lod_parameters_pan indices

Ricardo Garcia (1):

  • anv: Ignore continue flag in primary cmd buffers

Ricardo Quesada (1):

  • anv: support fd==-1 in ImportSemaphoreFdKHR

Rob Clark (46):

  • freedreno/registers: add some missing regs to build

  • freedreno/ir3: don’t install ir3_compiler cmdline tool

  • freedreno/ir3: add tracking for # of instructions per category

  • freedreno/ir3: add more disasm stats

  • freedreno/crashdec: handle section name typos

  • freedreno/decode: try harder to not crash in disasm

  • freedreno/registers: SC_WAIT_WC is not a6xx

  • freedreno/a6xx: only generate streamout for draw pass shader

  • freedreno/a6xx: fix occlusion query with more than one tile

  • freedreno/cffdump: add arg to filter by process name

  • freedreno/a6xx: disable LRZ when color channels are masked

  • freedreno/a6xx: refactor debug logging

  • freedreno: add debug helper to dump buffers

  • freedreno: handle case of shadowing current render target

  • freedreno/gmemtool: add tile_alignw/h and a650

  • freedreno: add env var to override GMEM size

  • freedreno: add env var to override tiles-per-pipe

  • freedreno/a6xx: fix hang with large render target

  • freedreno/batch: split out helper for rb alloc

  • freedreno/batch: replace lrz_clear with prologue

  • freedreno/a5xx+a6xx: use sysmem path for nondraw batches

  • freedreno/a6xx: move ubwc clear to blitter

  • freedreno: Fix missing rsc->seqno updates

  • freedreno: fence_server_sync() fixes

  • freedreno: Fix rast state for multisample clear

  • freedreno: Don’t bypass fd_draw_vbo() in clear fallback

  • freedreno/a6xx: Skip empty tile_setup

  • freedreno/a6xx: Fix fd6_draw_vbo() return

  • freedreno: Clear gs/tcs/tes state for clear blits

  • freedreno/a6xx: Fix MSAA clear

  • freedreno: fix fence-fd leak

  • ci/deqp-runner: Allow overriding width/height/config

  • ci: cherry-pick deqp fix for config choosing

  • ci: Enable remaining (non-rotate) mustpass CTS tests

  • freedreno/drm: drop bo’s dev reference

  • freedreno: Don’t leak border_color_buf reference

  • freedreno/a6xx: Small cleanup

  • freedreno/drm: Also clean ring_cache

  • freedreno/registers: Add a couple things used on kernel side

  • freedreno: Don’t leak LRZ bo’s

  • freedreno: Update import/export traces

  • freedreno: Disallow tiled if SHARED and not QCOM_COMPRESSED

  • freedreno: Rework GMEM limit init

  • freedreno/gmem: Respect max-height limits too

  • freedreno: Protect gmem_cache ralloc allocations

  • freedreno/ir3: Fix crash in shader compile fail path

Rohan Garg (3):

  • anv: Mark anv_dump_{start,finish} as PUBLIC

  • gitlab-ci: Test the traces from bgfx

  • virgl: Always enable emulated BGRA and swizzling unless specifically told not to

Roland Scheidegger (1):

  • gallivm: add InstSimplify pass

Roman Gilg (2):

  • vulkan/wsi/x11: add sent image counter

  • vulkan/wsi/x11: wait for acquirable images in FIFO mode

Roman Stratiienko (1):

  • android: freedreno: Another build fix

Ruijing Dong (1):

  • frontends/omx/enc: fix omx h264 encoding force-keyframe-period issue.

Ryan Neph (1):

  • virgl: Fixes portal2 binary name in tweak config

Sagar Ghuge (12):

  • intel/isl: Drop unnecessary check on 16bpp depth format

  • intel/blorp: Conditionally clear full surface depth and stencil

  • anv: Factor out dri option initialization code in separate function

  • anv: Add driconf option to disable compression for 16bpp format

  • anv: Return number of layers/levels attached to anv_image

  • anv: Handle compressed stencil buffer transition on Gen12+

  • anv: Set stencil_aux_usage flag

  • anv: Get aux usage from plane while clearing stencil buffer

  • anv: Don’t track clear bo for stencil buffer compression

  • anv: Return optimal aux state for stencil buffer compression

  • anv: Pass correct stencil aux usage during MSAA resolve

  • anv: Enable stencil buffer compression on Gen12+

Samuel Iglesias Gonsálvez (14):

  • freedreno/layout: add tile_all flag to the layout

  • turnip: add environment variable to disable LRZ

  • turnip: create LRZ buffer

  • turnip: disable LRZ on specific cases

  • turnip: disable LRZ writes when blend is enabled

  • turnip: disable LRZ depending on fragment changes

  • turnip: add LRZ tracking to command buffer state

  • turnip: add LRZ valid tracking for secondary command buffers

  • turnip: add support to clear LRZ

  • turnip: emit correct LRZ fast clear setup

  • turnip: disable LRZ on vkCmdClearAttachments()

  • turnip: disable LRZ on vkCmdClearattachments() 3D fallback path

  • turnip: enable LRZ

  • turnip: don’t initialize GRAS_LRZ_CNTL/RB_LRZ_CNTL tu6_init_hw()

Samuel Pitoiset (157):

  • radv: allow to force-enable LLVM internally for a specific shader stage

  • radv: report the spirv-nir logs back to the application

  • radv: rework the error function helpers a bit

  • radv: report errors back to the application via VK_EXT_debug_report

  • radv: report a better error message when QueueWaitIdle() failed

  • radv/gfx10: add missing initialization of registers

  • radv: limit LATE_ALLOC_GS to prevent a GPU hang on GFX10

  • radv: fix emitting the border color pointer on the compute queue

  • radv/winsys: add null winsys entries for Sienna Cichild/Navy Flounder

  • gitlab-ci: test Fossilize with GFX1030

  • aco: do not set valid_mask for POS0 exports on GFX 10.3

  • radv: track and report if a logical device is lost

  • aco: rename DEBUG_VALIDATE to DEBUG_VALIDATE_IR

  • aco: rework the way various compilation/validation errors are reported

  • radv,aco: report ACO errors/warnings back via VK_EXT_debug_report

  • aco: fix file leak in ra_fail()

  • radv: ignore BB labels when splitting the disassembly string

  • aco: add ACO_DEBUG=force-waitcnt to emit wait-states

  • amd/registers: add missing TBA registers on GFX6-GFX8

  • amd/registers: add some SQ_WAVE_* register definitions

  • aco: add TBA/TMA/TTMP0-11 physical registers definitions

  • aco: validate that SMEM operands can use fixed registers

  • aco: add a helper for building a trap handler shader

  • aco: skip unnecessary compiler pass for the trap handler program

  • radv: add a small interface for creating the trap handler shader

  • radv: add initial trap handler support with RADV_TRAP_HANDLER=1

  • radv: enable the trap handler and configure the shader exceptions

  • radv: use the trap handler to detect faulty shaders/instructions

  • radv: align the TMA BO size to 256

  • radv: allocate the TMA BO into 32-bit addr space

  • radv: fix setting EXCP_EN for different shader stages

  • radv: print a warning when RADV_TRAP_HANDLER is used

  • aco: add ACO_DEBUG=novn,noopt,nosched for debugging purposes

  • radv: emit {CB,DB}_RMI_L2_CACHE_CONTROL at framebuffer time

  • radv: set BIG_PAGE to improve performance on GFX10.3

  • aco: fix wrong source position for constant with nir_op_cube_face_coord

  • radv: dump shader stats with VK_KHR_pipeline_executable_properties

  • radv: force RADV_DEBUG=syncshaders when RADV_TRACE_FILE is used

  • radv: improve reporting faulty pipelines when a GPU hang is detected

  • radv: dump GPU info into the hang report

  • nir/algebraic: mark some optimizations with fsat(NaN) as inexact

  • spirv: fix retrieving dest type for OpFragmentMaskFetchAMD

  • radv,aco: disable opts if VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT

  • aco: handle unaligned loads on GFX10.3

  • spirv: fix emitting switch cases that directly jump to the merge block

  • radv: fix transform feedback crashes if pCounterBufferOffsets is NULL

  • radv: add a helper for loading meta descriptors

  • radv: do not lower UBO/SSBO access to offsets

  • radv: remove useless assignment of MAX_API_VERSION

  • radv: bump the advertised patch version to 145

  • radv: add VK_KHR_copy_commands2 but leave it disabled

  • radv: add support for CmdBlitImage2KHR()

  • radv: add support for CmdCopyBuffer2KHR()

  • radv: add support for CmdCopyBufferToImage2KHR()

  • radv: add support for CmdCopyImage2KHR()

  • radv: add support for CmdCopyImageToBuffer2KHR()

  • radv: cleanup selecting the hardware resolve path

  • radv: add support for CmdResolveImage2KHR()

  • radv: advertise VK_KHR_copy_commands2

  • radv: set KEEP_TOGETHER_ENABLE if necessary on GFX10+

  • radv: add a tweak for PS wave CU utilization for gfx10.3

  • ci: adjust RadeonSI rules

  • ci: add dEQP-VK.info.device_extensions to the list of skipped tests

  • nir/lower_memory_model: return progress when visiting instructions

  • nir/lower_memory_model: do not break with global atomic operations

  • ac/nir: implement nir_intrinsic_{load,store}_global

  • ac/nir: implement nir_intrinsic_global_atomic_*

  • radv: lower deref operations for global memory for both backends

  • ac/llvm: fix invalid IR if image stores are shrinked using the format

  • nir/lower_io: change nir_io_add_const_offset_to_base to use bitfield modes

  • radeonsi: call nir_io_add_const_offset_to_base only once per shader

  • radv/llvm: call nir_lower_io_to_vector with FS to fix array tests

  • radv: call nir_io_add_const_offset_to_base for FS outputs

  • radv: move lowering of FS outputs outside of ACO

  • radv: fix gathering writes_memory for global store/atomic operations

  • ac/llvm: fix invalid use of unreachable in ac_build_atomic_rmw()

  • ac/nir: fix nir_intrinsic_shared_atomic_fadd

  • radv: gather output usage mask from store_output for VS, TES and GS

  • radv/aco: lower IO for all stages outside of ACO

  • aco: apply the clamped integer addition disassembly workaround for v_add3

  • aco/tests: add disassembler tests to reproduce the add3+clamp crash

  • ac/llvm: adjust dmask when image stores are shrinked using the format

  • ac/nir: remove dead load/store deref code for temporary variables

  • radv/llvm: assign driver locations for VS, TCS, TES and GS correctly

  • radv/llvm: lower GS IO

  • radv/llvm: lower TES IO

  • radv/llvm: gather TCS outputs from the output variables

  • radv/llvm: lower TCS IO

  • radv/llvm: gather VS input usage mask from load_input

  • radv/llvm: lower VS IO

  • ac/llvm: implement nir_op_unpack_half_2x16_split_{x,y}

  • radv/llvm: enable lower_unpack_half_2x16

  • ac/nir: remove dead global load/store/atomic derefs code

  • ac/nir: remove dead shader IO code

  • radeonsi: remove dead code in TCS/TES/GS since const_index is always 0

  • ac,radv,radeonsi: remove unused parameters in the shader ABI IO

  • radv: remove unused gs.writes_memory in the shader info pass

  • radv: remove dead deref code in the shader info pass

  • ac/nir,radv: fix invalid IR when loading inline uniform blocks

  • nir/constant_folding: init nir_const_value to zero

  • aco: bail out if the NIR IO base offset isn’t zero

  • aco: more uses of nir_get_io_offset_src()

  • ac/nir: implement nir_op_fsat

  • radv/llvm: do not lower nir_op_fsat

  • radv/llvm: remove dead code for 64-bit GS inputs

  • aco: dump the program if the disassembler failed

  • radv/llvm: do not lower sub

  • radv: use the same NIR compiler options for both compiler backends

  • radv/llvm: stop assigning driver_location in NIR->LLVM

  • ac,radv,radeonsi: stop multiplying driver_location by 4

  • ac/nir: pass the variable location to store_tcs_outputs

  • radv/llvm: switch to NIR IO assigned locations

  • radv/llvm: reduce the ESGS itemsize by using NIR IO assigned locations

  • radv/llvm: reduce LDS size for tess by using NIR IO assigned locations

  • radv: remove one leftover TODO in the shader info pass

  • ac/llvm: move AC_FETCH_FORMAT to non-LLVM code

  • radv: replace RADV_ALPHA_ADJUST by AC_FETCH_FORMAT

  • radv: move lower_io_arrays_to_elements before lower_io_to_scalar_early

  • radv: fix adjusting vertex alpha

  • aco: implement missing nir_op_unpack_half_2x16_split_{x,y}_flush_to_zero

  • radv/aco: disable NGG GS support because it randomly hangs the GPU

  • radv: fix ignoring the vertex attribute stride if set as dynamic

  • aco: remove stub lower_wqm() prototype

  • aco: remove useless occurrences of radv_nir_compiler_options

  • aco: remove unused radv_shader.h includes

  • radv: move compiler statistics to ACO

  • aco: compute the CS workgroup size from the shader NIR info

  • aco: adjust an assertion about the wavesize in emit_gfx10_wave64_bpermute()

  • radv: fix optimizing needed states if some are marked as dynamic

  • ac/nir: implement missing nir_op_pack_half_2x16_split

  • radv: report latest extension spec versions

  • radv: add missing ‘discardtodemote’ option in the debug list

  • Revert “radv/aco: disable NGG GS support because it randomly hangs the GPU”

  • ac/nir: handle non-const offset with txf/txf_ms

  • radv: move all NIR pass outside of ACO

  • ac/nir: do not sign-extend the result of texop_samples_identical

  • radv,aco: fix use of texop_samples_identical in the resolve meta path

  • aco: fix determining if LOD is zero for nir_texop_txf/nir_texop_txs

  • ac/nir: ignore set_vertex_and_primitive_count intrinsic

  • ac/nir: abort when an unknown intrinsic is reached

  • ac: add an option to dump GPU info to a file

  • radv: add radv_dump_cmd() helper

  • radv: dump UMR ring and waves into the hang report

  • radv: dump GPU hang report logs into $HOME/radv_dumps_<pid>

  • radv: re-order GPU hang report dumps by usefulness

  • radv: replace RADV_TRACE_FILE by RADV_DEBUG=hang

  • radv: do not perform a FMASK expand for non-writeable MSAA images

  • radv: flush CB before and after FMASK_DECOMPRESS or DCC_DECOMPRESS

  • radv: enable VK_AMD_mixed_attachment_samples on GFX6-GFX7

  • radv,aco: adjust the sample mask only if per-sample shading is enabled

  • radv,aco: optimize computing the sample mask for per-sample shading

  • aco: store NIR range analysis data to the isel context

  • aco: select v_mul_{hi}_u32_u24 for 24-bit multiplications

  • nir/algebraic: distribute imul(iadd(a, b), c) when b and c are constants

  • aco: optimize v_and(a, v_subbrev_co(0, 0, vcc)) -> v_cndmask(0, a, vcc)

  • nir/algebraic: optimize bitfield_select(a, b, 0) to iand(a, b)

  • aco: fix combining add/sub to b2i if a new dest needs to be allocated

Serge Martin (13):

  • clover: set LLVM min version to 8.0.1

  • clover: implements clEnqueueMigrateMemObjects

  • clover: implements clEnqueueFillImage

  • clover: implements clGetKernelArgInfo

  • clover: bind sampler_t type to module::argument::sampler

  • clover: add CL_KERNEL_ATTRIBUTES for clGetKernelInfo

  • clover: implements clGetKernelWorkGroupInfo CL_KERNEL_COMPILE_WORK_GROUP_SIZE

  • clover: implements notification callback on program builds

  • clover: avoid adding an extra space to compiler options

  • clover: move tokenize function to algorithm

  • clover: validate image_row_pitch and image_slice_pitch in clEnqueueMapImage

  • clover: clCreateImage: calculate image row_pitch and slice_pitch when not provided

  • clover: implements clSetContextDestructorCallback

Suresh Guttula (2):

  • gallium: update abs_delta segmentation parameter

  • radeon/vcn: Corrected dpb_size calculation for VP9_2

Tapani Pälli (16):

  • anv: add a check for depthStencilState before using it

  • anv: null check for buffer before reading size

  • anv: take depth into account in anv_GetImageSubresourceLayout

  • mesa: refactor floating point texture fbo completeness check on gles

  • mesa: add EXT_color_buffer_half_float plumbing

  • mesa/st: enable EXT_color_buffer_half_float when formats supported

  • glsl: mark some builtins with correct glsl(es) version check

  • iris: remove additional pipe control done before hiz for older gens

  • glsl: take EXT_gpu_shader4 into account when adding round

  • gallivm/nir: handle nir_op_flt in lp_build_nir_llvm

  • iris: fix the order of src and dst for fence memcpy

  • mesa/st: call memobj_destroy only if there is memory imported

  • mesa: do not throw _mesa_problem when invalid enum is used

  • mesa/st: use a lock to protect access to variants when updating them

  • egl/dri2: fix race between image create and egl_image_target_texture

  • iris: initialize shared screen->vtbl only once

Thong Thai (10):

  • radeon/vcn: fix jpeg decode for navi10

  • frontends/va: Add support for NV12/P010/P016 to vaDeriveImage

  • frontends/va: Derive image from interlaced buffers

  • frontends/va: Derive image from interlaced buffers in some cases

  • gallium: Parse packed HEVC SPS encode header for crop parameters

  • radeon: Pass HEVC encode crop parameters to the encoder

  • frontends/va: Enabled packed headers for H.264 encoder

  • gallium/auxiliary/vl: Include src region in scale_y calculation

  • frontends/va/postproc: Un-break field flag

  • frontends/va: Return P010/P016 as possible surface formats when encoding

Timothy Arceri (15):

  • i965: add support for force_gl_vendor

  • disk_cache: move cache dir generation into OS specific helper file

  • disk_cache: add disk_cache_enabled() helper

  • disk_cache: move index mmap into OS specific helper

  • disk_cache: move munmap into an OS specific helper

  • disk_cache: move evict_lru_item() to an OS specific helper

  • disk_cache: create new helper for writing cache items to disk

  • disk_cache: move get_cache_file() to an OS specific helper

  • disk_cache: add new OS specific helper disk_cache_evict_item()

  • disk_cache: move cache item loading code into disk_cache_load_item() helper

  • glsl: don’t duplicate state vars as uniforms in the NIR linker

  • util/disk_cache: remove unused function param

  • glsl: relax rule on varying matching for shaders older than 4.00

  • glsl: add extra pp tokens workaround and enable for CoR

  • glsl: drop NMS OpenGL workarounds

Timur Kristóf (50):

  • aco: Fix unused variable warning by adding ASSERTED.

  • aco: Fix convert_to_SDWA when instruction has 3 operands.

  • aco: Move README to README-ISA

  • aco: Fixup markdown formatting of the README-ISA.

  • aco: Add README which explains about what ACO is and how it works.

  • aco: Fix emit_boolean_exclusive_scan in wave32 mode.

  • aco: Clean up emit_mbcnt.

  • aco: Add base argument to emit_mbcnt.

  • aco: Use NIR IO semantics for tess factor IO locations.

  • radv/aco: Set I/O variable locations outside ACO.

  • nir: Add ability to count emitted GS primitives.

  • nir: Add ability to count emitted GS vertices per primitive.

  • nir: Add ability to overwrite incomplete GS primitives.

  • nir: Count vertices per stream.

  • nir: Add ability to count primitives per stream.

  • radv/aco: Use new GS lowering options for ACO with NGG GS.

  • aco: Clarify missing export error message in assembler.

  • aco: Extract lanecount_to_mask to a separate function.

  • aco: Extract thread_id_in_threadgroup to a separate function.

  • aco: Use thread_id_in_threadgroup helper for ES outputs.

  • aco: Optimize thread_id_in_threadgroup when there is just one wave.

  • aco: Add wave-specific opcode for s_lshl and s_flbit.

  • aco/ngg: Refactor gs_alloc_req in preparation for NGG GS.

  • aco/ngg: Refactor ngg_emit_prim_export in preparation for NGG GS.

  • aco/ngg: Make primitive export packing less prone to error.

  • aco/ngg: Clean up and reorganize NGG VS/TES code.

  • aco/ngg: Allow NGG GS to store ES outputs.

  • aco/ngg: Allow NGG GS to load per-vertex GS inputs.

  • aco/ngg: Allow NGG GS to create VS exports.

  • aco/ngg: Setup NGG GS.

  • aco/ngg: Create LDS layout for NGG GS.

  • aco/ngg: Implement workgroup reduce / exclusive scan for NGG GS.

  • aco/ngg: Implement NGG GS output.

  • aco/ngg: Place workgroup barrier outside control flow for NGG GS.

  • aco/ngg: Add shader query support to NGG GS.

  • radv/aco: Enable NGG GS by default.

  • aco/ngg: Use more efficient LDS layout to help reduce bank conflicts.

  • aco/ngg: Allocate NGG GS space early for const vertex/primitive counts.

  • aco/ngg: Calculate workgroup size of NGG shaders.

  • nir: Emit set_vertex_and_primitive_count for inactive streams.

  • aco/ngg: Add assertion to make sure we always know the vertex count.

  • aco: Assert that workgroup barriers are not used inappropriately.

  • aco/ngg: Put shader query reduction operand into a VGPR.

  • aco: Add some validation for PSEUDO_REDUCTION instructions.

  • aco: Make emitting reduction instructions a bit more convenient.

  • aco: Add a few assertions about LDS usage.

  • aco/ngg: Export a zero-area triangle when primitive count is 0.

  • aco/ngg: Incorporate GS invocations into workgroup size calculation.

  • aco/optimizer: Only set scc_needed when it is actually needed.

  • aco: Fix NGG GS assert failure from the WG scan.

Tomeu Vizoso (16):

  • Revert “CI: temp disable t720/t760 jobs.”

  • Revert “CI: Disable Panfrost T720/T760 CI”

  • ci: Split traces.yml file per driver

  • ci: Test Panfrost with more traces

  • ci: Fix URL to imagediff page in traces dashboard

  • ci: Update kernel used in LAVA to 5.8-based drm-misc

  • ci: Run deqp-gles2 on RadeonSI

  • ci: Run deqp-gles3 and deqp-gles31 on RadeonSI

  • ci: Update kernel for LAVA

  • ci: Test Panfrost on Khadas VIM3 boards

  • ci: Disable pm_runtime and max clocks in LAVA jobs

  • ci: Unskip fragment_ops tests on Bifrost

  • virgl: Correctly align size of blobs

  • ci: Update kernel for LAVA to 5.10-rc2 plus patches

  • ci: Update dEQP skips and fails for Bifrost on G52

  • ci: Distribute AMDGPU driver to LAVA as a module

Tony Wasserka (26):

  • nir/lower_idiv: Port recent LLVM fixes to emit_udiv

  • radv: Fix various non-critical integer overflows

  • aco: Fix integer overflows when emitting parallel copies during RA

  • amd/common: Fix various non-critical integer overflows

  • aco/isel: Turn the function template emit_load into a proper function

  • aco/isel: Simplify nested branching code

  • aco/isel: Consistently use references for input parameters in emit_load

  • aco/isel: Remove unused definitions

  • aco/isel: Move context initialization code to a dedicated file

  • aco/isel: Move add_startpgm to aco_instruction_selection.cpp

  • aco/isel: Compile all helper functions with static linkage

  • nir: Fix undefined behavior due to signed integer multiplication overflows

  • nir: Fix unaligned pointer access

  • radv: Avoid calling memcpy with null pointers

  • radv: Fix unaligned memory access when writing specialization map entries

  • radv: Clean up CreateDescriptorSetLayout

  • radv: Respect alignment requirements in descriptor set layouts

  • aco/isel: Fix out-of-bounds write in visit_load_input

  • aco/isel: Always export position data from VS/NGG

  • aco/isel: Remove some dead code

  • aco/isel: Remove now unused VS-related code from create_null_export

  • aco: Use strong typing to model SW<->HW stage mappings

  • aco: Clean up symbol names and comments related to NGG

  • aco/isel: Miscellaneous cleanups using the new Stage API

  • aco/ra: Fix counting of subdword variables in get_reg_create_vector

  • aco: Fix format string used when raising validation errors

Veerabadhran Gopalakrishnan (1):

  • frontends/va: Added protected playback support for VP9

Vinson Lee (64):

  • util: Fix memory leaks in unit test.

  • meson: Fix lmsensors warning message.

  • radv/winsys: Fix memory leak.

  • vulkan: Fix memory leaks.

  • panfrost: Fix gnu-empty-initializer errors.

  • freedreno: Fix file descriptor leak.

  • svga: Fix unused printf argument.

  • spirv: Initialize spirv_test member shader.

  • nv50/ir: Add fallthrough statement.

  • nv50/ir: Remove duplicate mask assignment.

  • ac/llvm: Fix nonportable sizeof.

  • freedreno: Check file descriptor before write.

  • nv50/ir: Initialize Converter members.

  • libgl-gdi: Fix unused-variable warnings.

  • disk_cache: Fix filename leak on error path.

  • radeonsi: Remove unsigned comparison to zero.

  • panfrost: Delete debug allocated syncobj.

  • turnip: Release bo_mutex lock before potential error path.

  • pan/bi: Fix typo.

  • glsl: Initialize ir_constant member const_elements in all constructors.

  • r600/sfn: Initialize GPRValue member m_pin_to_channel.

  • gallium/dri2: Move image->texture assignment after image NULL check.

  • panfrost: Remove extra printf arguments.

  • anv: Check file descriptor before closing.

  • aco: Initialize mad_info member literal_idx.

  • gallium/swr: Remove unreachable code.

  • pan/mdg: Fix memory leak on error path.

  • lima: Print usage if --help is any of the arguments.

  • radv: Fix asserts using assign instead of compare.

  • nv50/ir: Initialize Source members.

  • freedreno: Move rsc NULL check to before rsc dereferences.

  • intel/vec4: Remove leftover code from Gen8+ removal.

  • glsl: Initialize ast_node member field location.path in constructor.

  • meson: Use more portable compiler option -std.

  • swr/rasterizer: Remove BuilderGfxMem member mpTrackMemAccessFuncTy.

  • util/xmlconfig: Initialize xmlconfig member options in constructor.

  • svga: Remove unused printf argument.

  • glsl: Initialize ir_to_mesa_visitor members in constructor.

  • v3dv: Fix assert using assign instead of compare.

  • glsl: Initialize lower_ubo_reference_visitor members in constructor.

  • glsl: Initialize add_uniform_to_shader member var in constructor.

  • v3dv: Remove unsigned comparison to zero.

  • v3dv: Initialize time before usage by free_stale_bos.

  • panfrost: Fix stride for AFBC_FORMAT_MOD_BLOCK_SIZE_32x8.

  • v3dv: Fix assert using assign instead of compare.

  • glsl: Initialize ir_if_to_cond_assign_visitor members in constructor.

  • glsl: Initialize lower_shared_reference_visitor members.

  • scons/windows: Support build with LLVM 11.

  • amd/addrlib: Initialize Gfx10Lib members in constructor.

  • Fix VMware capitalization.

  • glsl: Update loop_terminator constructor to accept parameters.

  • draw: Remove draw_install_aaline_stage dead code.

  • os: Fix open result check.

  • gallium: Remove duplicate resource variable.

  • tgsi: Initialize tgsi_declaration_dimension padding.

  • radeonsi: Remove unnecessary shader->selector NULL check.

  • amd/addrlib: Add missing va_end.

  • v3dv: Remove unsigned comparison to zero.

  • st/nine: Remove unnecessary NULL check.

  • turnip: Fix file descriptor return.

  • vdpau: Add missing printf format specifier.

  • frontends/va: Fix *num_entrypoints check.

  • clover/spirv: Add missing break for SpvOpExecutionMode case.

  • turnip: Close sync_fd only if it is a valid file descriptor.

Woody Chow (1):

  • st/mesa: Fix EGLImageTargetTexture2D for GL_TEXTURE_2D

Yevhenii Kolesnikov (1):

  • nir/large_constants: only search for constant duplicates

Yogesh Mohan Marimuthu (1):

  • src/mesa: add GL_NV_half_float extension support (v2)

jzielins (4):

  • gallium/swr: Fix compilation with LLVM 12

  • gallium/swr: Fix TCS/TES compilation issues

  • swr: Fix crashes on non-AVX hardware

  • swr: Use ElementCount constructor for LLVM 11

n00b7 (1):

  • v3dv/device: handle primary nodes for newer kernels

orbea (1):

  • spirv/vtn_cfg.c: Include util/debug.h for env_var_as_boolean.

zhu yong (1):

  • meson: add support for loongson’s mips/mips64 arch.