opengl-es - How to reduce draw calls in OpenGL/WebGL


Reposted by: 行者123. Updated: 2023-12-02 22:18:12

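Whenever I read about OpenGL/WebGL performance, I almost always hear that I should reduce draw calls. My problem is that I use only 4 vertices to draw a textured quad, which means my vbo usually contains only 4 vertices.
Basically: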

gl.bindBuffer(gl.ARRAY_BUFFER,vbo);
gl.uniformMatrix4fv(matrixLocation, false, modelMatrix);
gl.drawArrays(gl.TRIANGLE_FAN,0, vertices.length/3);

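Here is the problem I see. Before drawing, I update the model matrix of the current quad, for example to move it 5 units along the y axis.

So I have to do this: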

gl.bindBuffer(gl.ARRAY_BUFFER,vbo);
gl.uniformMatrix4fv(matrixLocation, false, modelMatrix);
gl.drawArrays(gl.TRIANGLE_FAN, 0, vertices.length/3);

gl.uniformMatrix4fv(matrixLocation, false, anotherModelMatrix);
gl.drawArrays(gl.TRIANGLE_FAN,0, vertices.length/3);
....// repeat until all textures are rendered

How could I reduce the draw calls, or even get down to a single draw call?

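Best answer

The first question is: does it matter?

If you are making fewer than 1000, maybe even 2000, draw calls, it probably doesn't matter. Ease of use is more important than most other solutions.

If you really need a lot of quads, there are several solutions. One is to put N quads into a single buffer. See this presentation. Then put position, rotation, and scale into other buffers or into a texture, and compute the matrices inside the shader.

In other words, for a textured quad people usually put vertex positions and texcoords in buffers ordered like this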

p0, p1, p2, p3, p4, p5,   // buffer for positions for 1 quad
t0, t1, t2, t3, t4, t5,   // buffer for texcoord for 1 quad

Instead you would do this

p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, ...  // positions for N quads
t0, t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11, ...  // texcoords for N quads

p0 - p5 are just the unit quad values, p6 - p11 are the same values, and p12 - p17 are again the same values. t0 - t5 are the unit texcoord values, t6 - t11 are the same texcoord values, and so on.
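As a minimal sketch of that layout, the unit quad can be tiled into one large array. The helper name `tilePattern` is hypothetical and not part of the original answer; the quad values match the ones used in the demo further down.

```javascript
// Unit quad as two triangles (6 vertices, 2 components each).
const unitQuadPositions = [
  -0.5, -0.5,
   0.5, -0.5,
  -0.5,  0.5,
  -0.5,  0.5,
   0.5, -0.5,
   0.5,  0.5,
];

// Hypothetical helper: repeat `pattern` back to back `count` times.
function tilePattern(pattern, count) {
  const out = new Float32Array(pattern.length * count);
  for (let i = 0; i < count; ++i) {
    out.set(pattern, i * pattern.length);
  }
  return out;
}

// Positions for 3 quads: p0..p17, each quad repeating the same 6 vertices.
const positions = tilePattern(unitQuadPositions, 3);
console.log(positions.length); // 3 quads * 6 vertices * 2 components
```

The same helper would work for the texcoords, since they repeat in exactly the same way.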

Then you add more buffers. Let's imagine all we want is world position and scale. So we add 2 more buffers

s0, s0, s0, s0, s0, s0, s1, s1, s1, s1, s1, s1, s2, ...  // scales for N quads
w0, w0, w0, w0, w0, w0, w1, w1, w1, w1, w1, w1, w2, ...  // world positions for N quads

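Notice how the scale repeats 6 times, once for each vertex of the first quad. Then it repeats 6 times again for the next quad, and so on. The same goes for the world position. That way all 6 vertices of a single quad share the same world position and the same scale.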
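One way to build such arrays is to start from one value per quad and expand it. This is only a sketch; `repeatPerVertex` is a hypothetical helper name, and 6 vertices per quad is assumed as in the layout above.

```javascript
// Hypothetical helper: expand one value per quad into one value per vertex.
// `perQuad` holds `numComponents` floats for each quad.
function repeatPerVertex(perQuad, numComponents, verticesPerQuad = 6) {
  const numQuads = perQuad.length / numComponents;
  const out = new Float32Array(numQuads * verticesPerQuad * numComponents);
  for (let q = 0; q < numQuads; ++q) {
    const value = perQuad.slice(q * numComponents, (q + 1) * numComponents);
    for (let v = 0; v < verticesPerQuad; ++v) {
      out.set(value, (q * verticesPerQuad + v) * numComponents);
    }
  }
  return out;
}

// Two quads with world positions w0 = (1, 2, 3) and w1 = (4, 5, 6).
const expanded = repeatPerVertex([1, 2, 3, 4, 5, 6], 3);
console.log(expanded.length); // 2 quads * 6 vertices * 3 components
```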

Now in the shader we can use those like this

attribute vec3 position;
attribute vec2 texcoord;
attribute vec3 worldPosition;
attribute vec3 scale;

uniform mat4 view;    // inverse of camera
uniform mat4 camera;  // inverse of view
uniform mat4 projection;

varying vec2 v_texcoord;

void main() {
   // Assuming we want billboards (quads that always face the camera)
   vec3 localPosition = (camera * vec4(position * scale, 0)).xyz;

   // make quad points at the worldPosition
   vec3 worldPos = worldPosition + localPosition;

   gl_Position = projection * view * vec4(worldPos, 1);

   v_texcoord = texcoord; // pass on texcoord to fragment shader
}
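To make the billboard math easier to follow, here is a CPU-side reference of what the vertex shader computes for one vertex. It is a sketch for checking values by hand, not something the original answer includes; it assumes column-major matrices, which is the layout `gl.uniformMatrix4fv` expects when `transpose` is `false`.

```javascript
// Multiply a direction (w = 0) by a column-major 4x4 matrix.
// With w = 0 the translation column (m[12..14]) has no effect.
function transformDirection(m, v) {
  return [
    v[0] * m[0] + v[1] * m[4] + v[2] * m[8],
    v[0] * m[1] + v[1] * m[5] + v[2] * m[9],
    v[0] * m[2] + v[1] * m[6] + v[2] * m[10],
  ];
}

// CPU reference for worldPos in the shader (names mirror the shader).
function billboardWorldPos(camera, position, scale, worldPosition) {
  const scaled = [
    position[0] * scale[0],
    position[1] * scale[1],
    position[2] * scale[2],
  ];
  const local = transformDirection(camera, scaled);
  return [
    worldPosition[0] + local[0],
    worldPosition[1] + local[1],
    worldPosition[2] + local[2],
  ];
}

const identity = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
// With an identity camera this is worldPosition + position * scale.
console.log(billboardWorldPos(identity, [0.5, 0.5, 0], [2, 2, 1], [10, 0, 0]));
```

Because the quad corner is rotated by the camera matrix only (w = 0), the quad stays aligned with the camera's axes wherever the camera sits, which is what makes it a billboard.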

Now, any time we want to set the position of a quad we need to set 6 world positions (one for each of its 6 vertices) in the corresponding buffer.

In general you can update all the world positions and then make a single call to gl.bufferData to upload all of them.
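A small sketch of that update step, assuming 6 vertices per quad and 3 floats per world position. The helper `setQuadWorldPosition` and the buffer name in the comments are hypothetical.

```javascript
const VERTS_PER_QUAD = 6;

// Hypothetical helper: write one quad's world position into all 6 of its vertices.
function setQuadWorldPosition(worldPositions, quadIndex, x, y, z) {
  const base = quadIndex * VERTS_PER_QUAD * 3;
  for (let v = 0; v < VERTS_PER_QUAD; ++v) {
    const off = base + v * 3;
    worldPositions[off + 0] = x;
    worldPositions[off + 1] = y;
    worldPositions[off + 2] = z;
  }
}

const quadCount = 4;
const quadWorldPositions = new Float32Array(quadCount * VERTS_PER_QUAD * 3);
setQuadWorldPosition(quadWorldPositions, 2, 7, 8, 9);

// After updating every quad that moved, one upload sends them all
// (worldPositionBuffer is assumed to be an existing WebGLBuffer):
//   gl.bindBuffer(gl.ARRAY_BUFFER, worldPositionBuffer);
//   gl.bufferData(gl.ARRAY_BUFFER, quadWorldPositions, gl.DYNAMIC_DRAW);
```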

Here are 100k quads

const vs = `
attribute vec3 position;
attribute vec2 texcoord;
attribute vec3 worldPosition;
attribute vec2 scale;

uniform mat4 view;    // inverse of camera
uniform mat4 camera;  // inverse of view
uniform mat4 projection;

varying vec2 v_texcoord;

void main() {
   // Assuming we want billboards (quads that always face the camera)
   vec3 localPosition = (camera * vec4(position * vec3(scale, 1), 0)).xyz;

   // make quad points at the worldPosition
   vec3 worldPos = worldPosition + localPosition;

   gl_Position = projection * view * vec4(worldPos, 1);

   v_texcoord = texcoord; // pass on texcoord to fragment shader
}
`;

const fs = `
precision mediump float;
varying vec2 v_texcoord;
uniform sampler2D texture;
void main() {
  gl_FragColor = texture2D(texture, v_texcoord);
}
`;

const m4 = twgl.m4;
const gl = document.querySelector("canvas").getContext("webgl");

// compiles and links shaders and looks up locations
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);

const numQuads = 100000;
const positions = new Float32Array(numQuads * 6 * 2);
const texcoords = new Float32Array(numQuads * 6 * 2);
const worldPositions = new Float32Array(numQuads * 6 * 3);
const basePositions = new Float32Array(numQuads * 3); // for JS
const scales = new Float32Array(numQuads * 6 * 2);
const unitQuadPositions = [
   -.5, -.5, 
    .5, -.5,
   -.5,  .5,
   -.5,  .5,
    .5, -.5,
    .5,  .5,
];
const unitQuadTexcoords = [
    0, 0,
    1, 0,
    0, 1,
    0, 1,
    1, 0,
    1, 1,
];

for (var i = 0; i < numQuads; ++i) {
  const off3 = i * 6 * 3;
  const off2 = i * 6 * 2;
  
  positions.set(unitQuadPositions, off2);
  texcoords.set(unitQuadTexcoords, off2);
  const worldPos = [rand(-100, 100), rand(-100, 100), rand(-100, 100)];
  const scale = [rand(1, 2), rand(1, 2)];
  basePositions.set(worldPos, i * 3);
  for (var j = 0; j < 6; ++j) {
    worldPositions.set(worldPos, off3 + j * 3);
    scales.set(scale, off2 + j * 2);
  }
}

const tex = twgl.createTexture(gl, {
  src: "http://i.imgur.com/weklTat.gif",
  crossOrigin: "",
  flipY: true,
});

// calls gl.createBuffer, gl.bufferData
const bufferInfo = twgl.createBufferInfoFromArrays(gl, {
  position: { numComponents: 2, data: positions, },
  texcoord: { numComponents: 2, data: texcoords, },
  worldPosition: { numComponents: 3, data: worldPositions, },
  scale: { numComponents: 2, data: scales, },
});

function render(time) {
   time *= 0.001; // seconds
   
   twgl.resizeCanvasToDisplaySize(gl.canvas);
   
   gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);
   gl.enable(gl.DEPTH_TEST);
   
   gl.useProgram(programInfo.program);
   
   // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
   twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
   
   const fov = Math.PI * .25;
   const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
   const zNear = .1;
   const zFar = 200;
   const projection = m4.perspective(fov, aspect, zNear, zFar);
   
   const radius = 100;
   const tm = time * .1
   const eye = [Math.sin(tm) * radius, Math.sin(tm * .9) * radius, Math.cos(tm) * radius];
   const target = [0, 0, 0];
   const up = [0, 1, 0];
   const camera = m4.lookAt(eye, target, up);
   const view = m4.inverse(camera);
   
   // calls gl.uniformXXX
   twgl.setUniforms(programInfo, { 
     texture: tex,
     view: view,
     camera: camera,
     projection: projection,
   });
   
   // update all the worldPositions
   for (var i = 0; i < numQuads; ++i) {
     const src = i * 3;
     const dst = i * 6 * 3;
     for (var j = 0; j < 6; ++j) {
       const off = dst + j * 3;
       worldPositions[off + 0] = basePositions[src + 0] + Math.sin(time + i) * 10;
       worldPositions[off + 1] = basePositions[src + 1] + Math.cos(time + i) * 10;
       worldPositions[off + 2] = basePositions[src + 2];
     }
   }
   
   // upload them to the GPU
   gl.bindBuffer(gl.ARRAY_BUFFER, bufferInfo.attribs.worldPosition.buffer);
   gl.bufferData(gl.ARRAY_BUFFER, worldPositions, gl.DYNAMIC_DRAW);
   
   // calls gl.drawXXX
   twgl.drawBufferInfo(gl, bufferInfo);
   
   requestAnimationFrame(render);
}
requestAnimationFrame(render);

function rand(min, max) {
  if (max === undefined) {
     max = min;
     min = 0;
  }
  return Math.random() * (max - min) + min;
}

body { margin: 0; }
canvas { width: 100vw; height: 100vh; display: block; }

<script src="https://twgljs.org/dist/3.x/twgl-full.min.js"></script>
<canvas />

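You can reduce the number of repeated vertices from 6 to 1 by using the ANGLE_instanced_arrays extension. It's not quite as fast as the technique above, but it's pretty close.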
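A rough sketch of what the instanced draw could look like. It uses the extension's `vertexAttribDivisorANGLE` and `drawArraysInstancedANGLE` functions; the function name `drawQuadsInstanced` and the `locations` object are hypothetical, and the sketch assumes the buffers are already bound to their attributes with the attributes enabled. With instancing, the position and texcoord buffers hold a single unit quad, and the world position and scale buffers hold one value per quad.

```javascript
// Sketch: mark per-quad attributes as per-instance, then draw numQuads
// instances of a single 6-vertex unit quad.
function drawQuadsInstanced(gl, ext, locations, numQuads) {
  // Advance these attributes once per instance instead of once per vertex.
  ext.vertexAttribDivisorANGLE(locations.worldPosition, 1);
  ext.vertexAttribDivisorANGLE(locations.scale, 1);
  // position and texcoord keep the default divisor of 0 (per vertex).
  ext.drawArraysInstancedANGLE(gl.TRIANGLES, 0, 6, numQuads);
}

// In a browser this would be used roughly like so:
//   const ext = gl.getExtension("ANGLE_instanced_arrays");
//   if (ext) { drawQuadsInstanced(gl, ext, locations, numQuads); }
```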

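You can also reduce the amount of data from 6 copies to 1 by storing the world positions and scales in a texture. In that case, instead of the 2 extra buffers, you add one extra buffer that contains just a repeating id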

// id buffer
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3 ....

The id repeats 6 times, once for each of the 6 vertices of each quad.

You then use that id to compute a texture coordinate to look up the world position and scale.

attribute float id;
...

uniform sampler2D worldPositionTexture;  // texture with world positions
uniform vec2 textureSize;               // pass in the texture size

...

  // compute the texel that contains our world position
  vec2 texel = vec2(
     mod(id, textureSize.x),
     floor(id / textureSize.x));

  // compute the UV coordinate to access that texel
  vec2 uv = (texel + .5) / textureSize;

  vec3 worldPosition = texture2D(worldPositionTexture, uv).xyz;
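The same lookup can be mirrored on the CPU to check which texel a given id lands on. This is a sketch for verification only; `idToUV` is a hypothetical name, and the sizes in the example call are made up.

```javascript
// CPU mirror of the shader's lookup: which texel holds the data for `id`,
// and which UV samples the center of that texel.
function idToUV(id, textureWidth, textureHeight) {
  const texelX = id % textureWidth;
  const texelY = Math.floor(id / textureWidth);
  return {
    texel: [texelX, texelY],
    uv: [(texelX + 0.5) / textureWidth, (texelY + 0.5) / textureHeight],
  };
}

// id 1025 in a 1024-wide texture is the second texel of the second row.
const lookup = idToUV(1025, 1024, 49);
console.log(lookup.texel, lookup.uv);
```

Adding 0.5 before dividing targets the center of the texel, which avoids sampling on the boundary between two texels.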

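Now you need to put your world positions in a texture, and you probably want a floating point texture to make that easy. You can do the same kind of thing for scale and so on, either storing each in a separate texture or all in the same texture and changing your uv calculation appropriately.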
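As a sketch of the packing step, one vec3 per quad can be written into RGBA float texels, with the array padded so it fills whole rows. The helper `packVec3sIntoRGBA` is hypothetical; the alpha channel is simply left unused.

```javascript
// Hypothetical helper: pack one vec3 per quad into RGBA float texels,
// padding the array so it fills a whole number of rows.
function packVec3sIntoRGBA(vec3s, width) {
  const numItems = vec3s.length / 3;
  const height = Math.ceil(numItems / width);
  const data = new Float32Array(width * height * 4);
  for (let i = 0; i < numItems; ++i) {
    data[i * 4 + 0] = vec3s[i * 3 + 0];
    data[i * 4 + 1] = vec3s[i * 3 + 1];
    data[i * 4 + 2] = vec3s[i * 3 + 2];
    // data[i * 4 + 3] (alpha) stays 0 and is not read by the shader.
  }
  return { data, width, height };
}

// 3 world positions in a 2-texel-wide texture need 2 rows (one texel of padding).
const packed = packVec3sIntoRGBA([1, 2, 3, 4, 5, 6, 7, 8, 9], 2);
console.log(packed.height, packed.data.length);
```

The important detail is the stride: quad `i` lives at float offset `i * 4`, not `i * 3`, because every texel has four channels.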

const vs = `
attribute vec3 position;
attribute vec2 texcoord;
attribute float id;

uniform sampler2D worldPositionTexture;  
uniform sampler2D scaleTexture;          
uniform vec2 textureSize;  // texture are same size so only one size needed
uniform mat4 view;    // inverse of camera
uniform mat4 camera;  // inverse of view
uniform mat4 projection;

varying vec2 v_texcoord;

void main() {
  // compute the texel that contains our world position
  vec2 texel = vec2(
     mod(id, textureSize.x),
     floor(id / textureSize.x));

  // compute the UV coordinate to access that texel
  vec2 uv = (texel + .5) / textureSize;

  vec3 worldPosition = texture2D(worldPositionTexture, uv).xyz;
  vec2 scale = texture2D(scaleTexture, uv).xy;

  // Assuming we want billboards (quads that always face the camera)
  vec3 localPosition = (camera * vec4(position * vec3(scale, 1), 0)).xyz;

  // make quad points at the worldPosition
  vec3 worldPos = worldPosition + localPosition;

  gl_Position = projection * view * vec4(worldPos, 1);

  v_texcoord = texcoord; // pass on texcoord to fragment shader
}
`;

const fs = `
precision mediump float;
varying vec2 v_texcoord;
uniform sampler2D texture;
void main() {
  gl_FragColor = texture2D(texture, v_texcoord);
}
`;

const m4 = twgl.m4;
const gl = document.querySelector("canvas").getContext("webgl");
const ext = gl.getExtension("OES_texture_float");
if (!ext) {
  alert("Doh! requires OES_texture_float extension");
}
if (gl.getParameter(gl.MAX_VERTEX_TEXTURE_IMAGE_UNITS) < 2) {
  alert("Doh! need at least 2 vertex texture image units");
}

// compiles and links shaders and looks up locations
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);

const numQuads = 50000;
const positions = new Float32Array(numQuads * 6 * 2);
const texcoords = new Float32Array(numQuads * 6 * 2);
const ids = new Float32Array(numQuads * 6);
const basePositions = new Float32Array(numQuads * 3); // for JS
// we need to pad these because textures have to be rectangles
const size = roundUpToNearest(numQuads * 4, 1024 * 4)
const worldPositions = new Float32Array(size);
const scales = new Float32Array(size);
const unitQuadPositions = [
   -.5, -.5, 
    .5, -.5,
   -.5,  .5,
   -.5,  .5,
    .5, -.5,
    .5,  .5,
];
const unitQuadTexcoords = [
    0, 0,
    1, 0,
    0, 1,
    0, 1,
    1, 0,
    1, 1,
];

for (var i = 0; i < numQuads; ++i) {
  const off2 = i * 6 * 2;
  const off4 = i * 4;
  
  // you could even put these in a texture OR you can even generate
  // them inside the shader based on the id. See vertexshaderart.com for
  // examples of generating positions in the shader based on id
  positions.set(unitQuadPositions, off2);
  texcoords.set(unitQuadTexcoords, off2);
  ids.set([i, i, i, i, i, i], i * 6);

  const worldPos = [rand(-100, 100), rand(-100, 100), rand(-100, 100)];
  const scale = [rand(1, 2), rand(1, 2)];
  basePositions.set(worldPos, i * 3);
    
  // one RGBA texel (4 floats) per quad, so each value is written exactly once
  worldPositions.set(worldPos, off4);
  scales.set(scale, off4);
}

const tex = twgl.createTexture(gl, {
  src: "http://i.imgur.com/weklTat.gif",
  crossOrigin: "",
  flipY: true,
});

const worldPositionTex = twgl.createTexture(gl, {
  type: gl.FLOAT,
  src: worldPositions,
  width: 1024,
  minMag: gl.NEAREST,
  wrap: gl.CLAMP_TO_EDGE,
});

const scaleTex = twgl.createTexture(gl, {
  type: gl.FLOAT,
  src: scales,
  width: 1024,
  minMag: gl.NEAREST,
  wrap: gl.CLAMP_TO_EDGE,
});

// calls gl.createBuffer, gl.bufferData
const bufferInfo = twgl.createBufferInfoFromArrays(gl, {
  position: { numComponents: 2, data: positions, },
  texcoord: { numComponents: 2, data: texcoords, },
  id: { numComponents: 1, data: ids, },
});

function render(time) {
   time *= 0.001; // seconds
   
   twgl.resizeCanvasToDisplaySize(gl.canvas);
   
   gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);
   gl.enable(gl.DEPTH_TEST);
   
   gl.useProgram(programInfo.program);
   
   // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
   twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
   
   const fov = Math.PI * .25;
   const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
   const zNear = .1;
   const zFar = 200;
   const projection = m4.perspective(fov, aspect, zNear, zFar);
   
   const radius = 100;
   const tm = time * .1
   const eye = [Math.sin(tm) * radius, Math.sin(tm * .9) * radius, Math.cos(tm) * radius];
   const target = [0, 0, 0];
   const up = [0, 1, 0];
   const camera = m4.lookAt(eye, target, up);
   const view = m4.inverse(camera);
   
   // update all the worldPositions
   for (var i = 0; i < numQuads; ++i) {
     const src = i * 3;
     const dst = i * 4; // 4 floats (RGBA) per texel
     worldPositions[dst + 0] = basePositions[src + 0] + Math.sin(time + i) * 10;
     worldPositions[dst + 1] = basePositions[src + 1] + Math.cos(time + i) * 10;
     worldPositions[dst + 2] = basePositions[src + 2];
   }
   
   // upload them to the GPU
   const width = 1024;
   const height = worldPositions.length / width / 4;
   gl.bindTexture(gl.TEXTURE_2D, worldPositionTex);
   gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0, gl.RGBA, gl.FLOAT, worldPositions); 
   
   // calls gl.uniformXXX, gl.activeTexture, gl.bindTexture
   twgl.setUniforms(programInfo, { 
     texture: tex,
     scaleTexture: scaleTex,
     worldPositionTexture: worldPositionTex,
     textureSize: [width, height],
     view: view,
     camera: camera,
     projection: projection,
   });
   
   // calls gl.drawXXX
   twgl.drawBufferInfo(gl, bufferInfo);
   
   requestAnimationFrame(render);
}
requestAnimationFrame(render);

function rand(min, max) {
  if (max === undefined) {
     max = min;
     min = 0;
  }
  return Math.random() * (max - min) + min;
}

function roundUpToNearest(v, round) {
  return ((v + round - 1) / round | 0) * round;
}

body { margin: 0; }
canvas { width: 100vw; height: 100vh; display: block; }

<script src="https://twgljs.org/dist/3.x/twgl-full.min.js"></script>
<canvas />

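Note that, at least on my machine, doing it through a texture is slower than doing it through buffers. So while it's less work for JavaScript (only one worldPosition to update per quad), it's apparently more work for the GPU (again, at least on my machine). For me the buffer version runs at 60fps with 100k quads, whereas the texture version ran at about 40fps with 100k quads. I lowered it to 50k, but of course those numbers are for my machine. Other machines will vary.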

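Techniques like this let you have far more quads, but that comes at the expense of flexibility. You can only manipulate them in the ways your shader provides. For example, if you want to be able to scale from different origins (center, top-left, bottom-right, etc.), you need to add another piece of data or set the positions. If you want to rotate, you need to add rotation data, and so on.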
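To illustrate the extra piece of data for scaling, here is one possible formulation, written as a CPU-side sketch. The per-quad `origin` value is an assumption and does not appear in the original answer; in a shader it would be one more attribute or texture lookup.

```javascript
// Sketch: scale a unit-quad corner about a per-quad origin.
// `origin` uses the same units as the corner: [0, 0] is the center and
// [-0.5, 0.5] is the top-left corner (assuming +y is up).
// The result is relative to the origin point, which would then be
// placed at the quad's world position.
function scaleAboutOrigin(corner, origin, scale) {
  return [
    (corner[0] - origin[0]) * scale[0],
    (corner[1] - origin[1]) * scale[1],
  ];
}

// Scaling by 2 about the top-left corner keeps that corner fixed
// and moves the bottom-right corner twice as far away.
const topLeft = [-0.5, 0.5];
console.log(scaleAboutOrigin([-0.5, 0.5], topLeft, [2, 2]));
console.log(scaleAboutOrigin([0.5, -0.5], topLeft, [2, 2]));
```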

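You could even pass in an entire matrix per quad, but then you'd be uploading 16 floats per quad. It still might be faster, since you're already doing that when calling gl.uniformMatrix4fv, but you'd be making only 2 calls: gl.bufferData or gl.texImage2D to upload the new matrices, and then gl.drawXXX to draw.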
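For a sense of scale, the amount of data uploaded per frame can be estimated with simple arithmetic. This is size arithmetic only, under the layouts described above, and says nothing definite about speed; `uploadBytes` is a hypothetical helper.

```javascript
const BYTES_PER_FLOAT = 4;

// Bytes uploaded per frame when each quad needs `floatsPerQuad` floats
// and each value is stored `copies` times.
function uploadBytes(quads, floatsPerQuad, copies) {
  return quads * floatsPerQuad * copies * BYTES_PER_FLOAT;
}

const quads = 100000;
const perVertexPosition = uploadBytes(quads, 3, 6);  // vec3 repeated for 6 vertices
const texturePosition = uploadBytes(quads, 4, 1);    // one RGBA float texel per quad
const perVertexMatrix = uploadBytes(quads, 16, 6);   // mat4 repeated for 6 vertices
const perQuadMatrix = uploadBytes(quads, 16, 1);     // mat4 stored once per quad

console.log({ perVertexPosition, texturePosition, perVertexMatrix, perQuadMatrix });
```

For 100k quads that works out to 7.2 MB for repeated world positions, 1.6 MB for the texture version, 38.4 MB for a matrix repeated per vertex, and 6.4 MB for one matrix per quad.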

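One more issue is that you mentioned textures. If each quad uses a different texture, you need to figure out how to combine them into a texture atlas (all the images in one texture), in which case your UV coordinates would not repeat the way they do above.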
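A minimal sketch of per-quad texcoords for an atlas, assuming the atlas is a uniform grid of equally sized cells. The grid layout and the name `atlasTexcoords` are assumptions; real atlases often use arbitrary rectangles.

```javascript
// Sketch: texcoords for the 6 vertices of a quad that shows one cell of a
// texture atlas laid out as a uniform grid of cols x rows cells.
function atlasTexcoords(col, row, cols, rows) {
  const u0 = col / cols;
  const v0 = row / rows;
  const u1 = (col + 1) / cols;
  const v1 = (row + 1) / rows;
  // Same vertex order as unitQuadTexcoords in the demos above.
  return [
    u0, v0,
    u1, v0,
    u0, v1,
    u0, v1,
    u1, v0,
    u1, v1,
  ];
}

// Second cell in the first row of a 4x4 atlas: u spans 0.25 to 0.5, v spans 0 to 0.25.
const cellUVs = atlasTexcoords(1, 0, 4, 4);
console.log(cellUVs);
```

Each quad would write its own 12 values into the texcoord buffer instead of the repeated unit texcoords.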

This post covers opengl-es - How to reduce draw calls in OpenGL/WebGL. A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42473044/

