c - Flow cytometry FCS file data segment, linear data seems skewed


Reposted by: 行者123 Updated: 2023-12-04 21:58:29

Final update (I promise)

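The core of the problem was how I iterated over the data, as Jonathan Leffler hinted. The binary data is "laid out" as a matrix, one row per event and one column per parameter. For example, if I have 4 events and 4 parameters with a bit width of 8, the binary data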

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

looks like

00 00 00 00
00 00 00 00
00 00 00 00
00 00 00 00

I have two for loops, over i and j, which I need to use to compute the offset.

I originally had

(i * PAR * 2) + (j * PnB/8)

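where PAR is the number of parameters, PnB is the bit width, i runs from 0 to the total number of events, and j runs from 0 to PAR. This is incorrect, and I am not sure how I arrived at this formula.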

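I am developing in-house flow analysis software and have run into a problem. The FCS sample data file I am testing with was generated on a FACSCalibur running CellQuest on MacOS 9. When I extract the data points for FSC-H and SSC-H, my results differ from what other flow software (i.e. FlowJo) gives. I understand that data generated by CellQuest on MacOS 9 is stored in big-endian order, and I believe I am converting it correctly: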

for (int i = 0; i < params[j-1].PnB/8; ++i)
{
    lebyte[i] = (bytes[(params[j-1].PnB/8)-1-i] & 0xff) << i*8u;
    cx |= lebyte[i];
}

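The code may not be elegant, but it seems to do what I expect on known data samples.

PnB is the bit width; PnR is the range of channel values.

The results I get with real flow data look plausible, since the values fall within the range specified by PnR: if PnR = 1024, the data stored in the 16-bit field lies between 0 and 1023.

However, when I plot the data I get a skewed dot plot, with the scatter bending toward the FSC-H x-axis.

Below is an excerpt from the FCS 3.1 standard (Data File Standard for Flow Cytometry, International Society for Advancement of Cytometry; p. 13):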

$BYTEORD/n1,n2,n3,n4/ $BYTEORD/4,3,2,1/ [REQUIRED]

This keyword specifies the endianness of the data, i.e., the byte order used to binary store numeric data values in the data set. This value of the keyword corresponds to the order from numerically least significant {1} to numerically most significant {4} in which four binary data bytes are written to compose a 32-bit word in the data acquisition computer. The numbers are separated by commas (ASCII 44). Only two distinct values are allowed:

$BYTEORD/1,2,3,4/ (little endian, i.e., least significant byte written first, e.g., x86 based personal computers)

$BYTEORD/4,3,2,1/ (big endian, i.e., least significant byte is written last, e.g., PowerPC including older Apple Macintosh computers prior to switch to Intel-based architecture) One of these values shall be used to specify the endianness even if the size of data values exceeds 32 bits ($DATATYPE/D/)

I apologize in advance if I have not explained this well, and I am happy to clarify any point further as needed. Any help is much appreciated.

Update: image attached to illustrate the point.

Figure 1: Skewed SSC-H x FSC-H

Update 2

I wrote a simplified version of the byte-order converter and tested it.

#include <stdio.h>
#include <stdlib.h>

int main() {
    int PnB = 16; // bitwidth of data stored for a specific channel value
    // for example the data value for sample A is stored in 16 bits.
    char bytes[PnB/8];
    unsigned int lebyte[PnB/8];
    unsigned int cx = 0;

    unsigned int b0, b1;

    /*  |  [0] |  [1] |
    *  | 0xff | 0x03 |
    */
    bytes[1] = 0x03;
    bytes[0] = 0xff;

    // in big endian print out

    b0 = (bytes[0] & 0xff) << 8u;
    b1 = bytes[1] & 0xff;

    cx = b0 | b1;

    printf("%u\n", cx);

    cx = 0;

    // convert to little endian

    for (int i = 0; i < PnB/8; ++i)
    {
        lebyte[i] = (bytes[i] & 0xff) << i*8u;
        cx |= lebyte[i];
    }

    printf("%u\n", cx);
}

The result is correct:

65283

1023

So I corrected the original source code (I made the change in the original post to save space).

In addition, I use memcpy to copy the data out of the buffer.

memset(bytes, '\0', sizeof(char)*params[j-1].PnB/8);
memcpy(bytes, databuf+((i*data->PAR*2)+(j*params[j-1].PnB/8)), params[j-1].PnB/8);

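But I still get skewed data. It may have something to do with how the data is transformed. I just cannot find any information online, and I am sure the makers of FlowJo would not be willing to share that secret ;). I will keep looking and see what I find.

Update 3

Sorry for making this longer, but here is some more information: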

typedef struct _fcs_parameter {
    double f1;          // logarithmic decade
    double f2;          // minimum value on log scale
    unsigned int PnB;   // bitwidth
    unsigned int PnR;   // range
    fcs_events *events; // event data
    char *sname;        // short name
    char *lname;        // filter name
} fcs_parameter;

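Best answer

Take a close look at the memset() and memcpy() lines you show. Since you do not show how i is set, nor what is in your params array of structures, it is a bit hard to interpret. However, if the PnB members hold different sizes, then I think your memcpy() offset calculation is bogus.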

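Here is some code and its output; you will have to adapt it somewhat to your situation. The last part tries to emulate your memset/memcpy code, given that there is no explanation of many of the variables you show. It includes an alternative interpretation that seems to make more sense.

Assuming you are using C99, you could make some of the functions static inline. Other parts of the code also assume C99. Fixing it up for C89 would not be hard, but I am not going to do that for you.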

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static uint16_t convert_uint16be(uint8_t const *bytes)
{
    uint16_t r = (bytes[0] << 8) | bytes[1];
    return r;
}

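static uint32_t convert_uint32be(uint8_t const *bytes)
{
    /* Widen each byte to uint32_t before shifting so that a top byte of
     * 0x80 or more is never shifted into the sign bit of a promoted int. */
    uint32_t r = ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
                 ((uint32_t)bytes[2] << 8)  |  (uint32_t)bytes[3];
    return r;
}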

static void print16(uint8_t const *bytes)
{
  uint16_t r1 = convert_uint16be(bytes);
  int16_t  r2 = convert_uint16be(bytes);
  printf("0x%.2X 0x%.2X = 0x%.4" PRIX16 " = %6" PRId16 "\n", bytes[0], bytes[1], r1, r2);
}

static void print32(uint8_t const *bytes)
{
  uint32_t r1 = convert_uint32be(bytes);
  int32_t  r2 = convert_uint32be(bytes);
  printf("0x%.2X 0x%.2X 0x%.2X 0x%.2X = 0x%.8" PRIX32 " = %11" PRId32 "\n", bytes[0], bytes[1], bytes[2], bytes[3], r1, r2);
}

int main(void)
{
    int PnB = 16; // bitwidth of data stored for a specific channel value
    // for example the data value for sample A is stored in 16 bits.
    char bytes[PnB/8];
    unsigned int lebyte[PnB/8];
    unsigned int cx = 0;
    unsigned int b0, b1;

    /*  |  [0] |  [1] |
     *  | 0xff | 0x03 |
     */
    bytes[0] = 0xff;
    bytes[1] = 0x03;

    // in big endian print out
    b0 = (bytes[0] & 0xff) << 8u;
    b1 = bytes[1] & 0xff;
    cx = b0 | b1;

    printf("%5u = 0x%.4X\n", cx, cx);

    // convert to little endian
    cx = 0;
    for (int i = 0; i < PnB/8; ++i)
    {
        lebyte[i] = (bytes[i] & 0xff) << i*8u;
        cx |= lebyte[i];
    }
    printf("%5u = 0x%.4X\n", cx, cx);

    print16((uint8_t *)bytes);

    uint8_t data[] =
    {
      0x00, 0x00, 0x00, 0x00,
      0x00, 0x00, 0x03, 0xFF,
      0x00, 0x00, 0xFF, 0xFF,
      0x08, 0x08, 0x09, 0xC0,
      0x80, 0x80, 0x90, 0x0C,
      0xFF, 0xFF, 0xED, 0xBC,
    };
    int data_size = sizeof(data) / sizeof(data[0]);

    for (int i = 0; i < data_size; i += 2)
      print16(&data[i]);
    for (int i = 0; i < data_size; i += 4)
      print32(&data[i]);

    {
      struct { int PnB; } params[] = { { 16 }, { 16 }, { 32 }, { 16 }, { 16 }, };
      int num_params = sizeof(params) / sizeof(params[0]);
      uint8_t value[4];
      int i = 0;
      int num = num_params;
      int offset = 0;
      for (int j = 1; j <= num; j++)
      {
        memset(value, '\0', sizeof(char)*params[j-1].PnB/8);
        printf("i = %2d; j = %2d; offset = %2d; calc = %2d; size = %2d\n",
               i, j, offset, ((i*7*2)+(j*params[j-1].PnB/8)), params[j-1].PnB/8);
        /* The calculation works plausibly when all params[n].PnB are the same
         * size, but not otherwise
         */
        memcpy(value, data+((i*7*2)+(j*params[j-1].PnB/8)), params[j-1].PnB/8);
        /* Index with j-1, matching the memcpy sizes above; params[j] would
         * pick the printer for the next parameter and read past the end of
         * the array when j == num.
         */
        if (params[j-1].PnB == 16)
          print16(value);
        else
          print32(value);
        memcpy(value, data+offset, params[j-1].PnB/8);
        if (params[j-1].PnB == 16)
          print16(value);
        else
          print32(value);
        offset += params[j-1].PnB/8;
      }
    }

    return 0;
}

Sample output:

65283 = 0xFF03
 1023 = 0x03FF
0xFF 0x03 = 0xFF03 =   -253
0x00 0x00 = 0x0000 =      0
0x00 0x00 = 0x0000 =      0
0x00 0x00 = 0x0000 =      0
0x03 0xFF = 0x03FF =   1023
0x00 0x00 = 0x0000 =      0
0xFF 0xFF = 0xFFFF =     -1
0x08 0x08 = 0x0808 =   2056
0x09 0xC0 = 0x09C0 =   2496
0x80 0x80 = 0x8080 = -32640
0x90 0x0C = 0x900C = -28660
0xFF 0xFF = 0xFFFF =     -1
0xED 0xBC = 0xEDBC =  -4676
0x00 0x00 0x00 0x00 = 0x00000000 =           0
0x00 0x00 0x03 0xFF = 0x000003FF =        1023
0x00 0x00 0xFF 0xFF = 0x0000FFFF =       65535
0x08 0x08 0x09 0xC0 = 0x080809C0 =   134744512
0x80 0x80 0x90 0x0C = 0x8080900C = -2139058164
0xFF 0xFF 0xED 0xBC = 0xFFFFEDBC =       -4676
i =  0; j =  1; offset =  0; calc =  2; size =  2
0x00 0x00 = 0x0000 =      0
0x00 0x00 = 0x0000 =      0
i =  0; j =  2; offset =  2; calc =  4; size =  2
0x00 0x00 = 0x0000 =      0
0x00 0x00 = 0x0000 =      0
i =  0; j =  3; offset =  4; calc = 12; size =  4
0x08 0x08 0x09 0xC0 = 0x080809C0 =   134744512
0x00 0x00 0x03 0xFF = 0x000003FF =        1023
i =  0; j =  4; offset =  8; calc =  8; size =  2
0x00 0x00 = 0x0000 =      0
0x00 0x00 = 0x0000 =      0
i =  0; j =  5; offset = 10; calc = 10; size =  2
0xFF 0xFF = 0xFFFF =     -1
0xFF 0xFF = 0xFFFF =     -1

Regarding c - Flow cytometry FCS file data segment, linear data seems skewed, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21613501/
