c# - How do I correctly extract subscripts/superscripts from a PDF using iTextSharp?

Reposted by: 太空狗. Updated: 2023-10-30 01:16:56

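iTextSharp works well for extracting plain text from PDF documents, but I'm having trouble with the subscript/superscript text that is common in technical documents.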

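TextChunk.SameLine() requires two chunks to have identical vertical positioning to be "on" the same line, which isn't the case for superscript or subscript text. For example, on page 11 of this document, under "COMBUSTION EFFICIENCY":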

http://www.mass.gov/courts/docs/lawlib/300-399cmr/310cmr7.pdf

Expected text:

monoxide (CO) in flue gas in accordance with the following formula: C.E. = [CO2 /(CO + CO2)]

Resulting text:

monoxide (CO) in flue gas in accordance with the following formula: C.E. = [CO /(CO + CO )] 
2 2

I moved SameLine() to LocationTextExtractionStrategy and added public getters for the private TextChunk properties it reads. This lets me adjust the tolerances on the fly in my own subclass, as shown here:

public class SubSuperStrategy : LocationTextExtractionStrategy {
  public int SameLineOrientationTolerance { get; set; }
  public int SameLineDistanceTolerance { get; set; }

  public override bool SameLine(TextChunk chunk1, TextChunk chunk2) {
    var orientationDelta = Math.Abs(chunk1.OrientationMagnitude
       - chunk2.OrientationMagnitude);
    if(orientationDelta > SameLineOrientationTolerance) return false;
    var distDelta = Math.Abs(chunk1.DistPerpendicular
       - chunk2.DistPerpendicular);
    return (distDelta <= SameLineDistanceTolerance);
  }
}

With a SameLineDistanceTolerance of 3, this corrects which line the sub/superscript chunks are assigned to, but the relative position of the text is way off:

monoxide (CO) in flue gas in accordance with the following formula:   C.E. = [CO /(CO + CO )] 2 2

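Sometimes the chunks get inserted somewhere in the middle of the text, and sometimes (as in this example) at the end. Either way, they don't end up in the right place. I suspect this has something to do with font sizes, but I'm at the limit of my understanding of this code.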

Has anyone found another way to deal with this?

(I'm happy to submit a pull request with my changes if that helps.)

Best answer

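To extract these subscripts and superscripts in line, one needs a different way to check whether two text chunks are on the same line. The following classes represent one such approach.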

I'm more at home in Java/iText; therefore, I implemented this approach in Java first and only afterwards translated it to C#/iTextSharp.
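As a rough illustration of why widening the same-line tolerance alone leaves the subscripts in the wrong place, consider the toy sketch below. The `Chunk` class, the coordinates, and the `extract` helper are made up for this example and are not iText types. The sketch only imitates a sort order that compares the perpendicular distance before the position along the line.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ToleranceOrderingDemo {
    /** Made-up stand-in for a text chunk: text plus baseline start (x, y). */
    static final class Chunk {
        final String text;
        final float x;
        final float y;

        Chunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }

        /** For horizontal text the perpendicular distance is -y, truncated to an int. */
        int distPerpendicular() {
            return (int) -y;
        }
    }

    /** Hypothetical sample: "[CO2/(CO + CO2)]" with both subscripts 3 units lower. */
    static List<Chunk> sample() {
        return List.of(
                new Chunk("[CO", 100, 700), new Chunk("2", 120, 697),
                new Chunk("/(CO + CO", 126, 700), new Chunk("2", 180, 697),
                new Chunk(")]", 186, 700));
    }

    /** Sorts by perpendicular distance, then x, and joins using a same-line tolerance. */
    static String extract(List<Chunk> chunks, int sameLineTolerance) {
        List<Chunk> sorted = new ArrayList<>(chunks);
        sorted.sort(Comparator.comparingInt(Chunk::distPerpendicular)
                .thenComparingDouble(c -> c.x));
        StringBuilder sb = new StringBuilder();
        Chunk last = null;
        for (Chunk c : sorted) {
            if (last != null) {
                int delta = Math.abs(c.distPerpendicular() - last.distPerpendicular());
                // The tolerance only decides between "space" and "newline";
                // it has no influence on the order established by the sort above.
                sb.append(delta <= sameLineTolerance ? " " : "\n");
            }
            sb.append(c.text);
            last = c;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(extract(sample(), 0)); // subscripts end up on their own line
        System.out.println("---");
        System.out.println(extract(sample(), 3)); // same line now, but still at the end
    }
}
```

In this sketch the tolerance changes where the line break goes, yet both "2" chunks are still sorted after the rest of the line. This is why the sort order itself has to become line-aware.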

An approach using Java and iText

I'm using the current development branch, iText 5.5.8-SNAPSHOT.

An approach to identify lines

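Assuming that text lines are horizontal and that the vertical extents of the glyph bounding boxes on different lines do not overlap, one can try to identify lines using a RenderListener like this: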

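import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import com.itextpdf.text.pdf.parser.ImageRenderInfo;
import com.itextpdf.text.pdf.parser.LineSegment;
import com.itextpdf.text.pdf.parser.RenderListener;
import com.itextpdf.text.pdf.parser.TextRenderInfo;
import com.itextpdf.text.pdf.parser.Vector;

public class TextLineFinder implements RenderListener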
{
    @Override
    public void beginTextBlock() { }
    @Override
    public void endTextBlock() { }
    @Override
    public void renderImage(ImageRenderInfo renderInfo) { }

    /*
     * @see RenderListener#renderText(TextRenderInfo)
     */
    @Override
    public void renderText(TextRenderInfo renderInfo)
    {
        LineSegment ascentLine = renderInfo.getAscentLine();
        LineSegment descentLine = renderInfo.getDescentLine();
        float[] yCoords = new float[]{
                ascentLine.getStartPoint().get(Vector.I2),
                ascentLine.getEndPoint().get(Vector.I2),
                descentLine.getStartPoint().get(Vector.I2),
                descentLine.getEndPoint().get(Vector.I2)
        };
        Arrays.sort(yCoords);
        addVerticalUseSection(yCoords[0], yCoords[3]);
    }

    /**
     * This method marks the given interval as used.
     */
    void addVerticalUseSection(float from, float to)
    {
        if (to < from)
        {
            float temp = to;
            to = from;
            from = temp;
        }

        int i=0, j=0;
        for (; i<verticalFlips.size(); i++)
        {
            float flip = verticalFlips.get(i);
            if (flip < from)
                continue;

            for (j=i; j<verticalFlips.size(); j++)
            {
                flip = verticalFlips.get(j);
                if (flip < to)
                    continue;
                break;
            }
            break;
        }
        boolean fromOutsideInterval = i%2==0;
        boolean toOutsideInterval = j%2==0;

        while (j-- > i)
            verticalFlips.remove(j);
        if (toOutsideInterval)
            verticalFlips.add(i, to);
        if (fromOutsideInterval)
            verticalFlips.add(i, from);
    }

    final List<Float> verticalFlips = new ArrayList<Float>();
}

( TextLineFinder.java )

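This RenderListener tries to identify horizontal text lines by projecting the text bounding boxes onto the y axis. It assumes that these projections do not overlap for text from different lines, even in the case of subscripts and superscripts.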
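To make the projection idea concrete, here is a small stand-alone sketch that runs the same interval-merging logic as addVerticalUseSection on an explicit list, without any iText classes. The class name `VerticalFlipsDemo` and the y values are made up for illustration. The list holds alternating interval starts and ends, so every pair of entries describes one used band on the y axis.

```java
import java.util.ArrayList;
import java.util.List;

public class VerticalFlipsDemo {
    /** Marks [from, to] as used, merging it with any bands it overlaps. */
    static void addVerticalUseSection(List<Float> flips, float from, float to) {
        if (to < from) {
            float temp = to;
            to = from;
            from = temp;
        }
        int i = 0, j = 0;
        // i: first flip not below 'from'; j: first flip not below 'to'.
        for (; i < flips.size(); i++) {
            if (flips.get(i) < from)
                continue;
            for (j = i; j < flips.size(); j++) {
                if (flips.get(j) < to)
                    continue;
                break;
            }
            break;
        }
        // An even index means the coordinate lies outside every existing band.
        boolean fromOutsideInterval = i % 2 == 0;
        boolean toOutsideInterval = j % 2 == 0;
        // Drop all flips swallowed by the new band, then add its own borders if needed.
        while (j-- > i)
            flips.remove(j);
        if (toOutsideInterval)
            flips.add(i, to);
        if (fromOutsideInterval)
            flips.add(i, from);
    }

    /** Hypothetical page: one text line, a subscript overlapping it, and a second line. */
    static List<Float> demo() {
        List<Float> flips = new ArrayList<>();
        addVerticalUseSection(flips, 700f, 712f); // main text of the first line
        addVerticalUseSection(flips, 698f, 704f); // subscript, overlaps the first line
        addVerticalUseSection(flips, 680f, 692f); // second line, no overlap
        return flips;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [680.0, 692.0, 698.0, 712.0]
    }
}
```

The subscript extends the first band downwards from 700 to 698 instead of opening a band of its own, which is the property the line detection relies on.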

This class is essentially a reduced form of the PageVerticalAnalyzer used in this answer.

Sorting text chunks by those lines

Having identified the lines as above, one can tweak iText's LocationTextExtractionStrategy to sort along those lines like this:

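import java.lang.reflect.Field;
import java.util.List;

import com.itextpdf.text.pdf.parser.LineSegment;
import com.itextpdf.text.pdf.parser.LocationTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.Matrix;
import com.itextpdf.text.pdf.parser.TextRenderInfo;
import com.itextpdf.text.pdf.parser.Vector;

public class HorizontalTextExtractionStrategy extends LocationTextExtractionStrategy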
{
    public class HorizontalTextChunk extends TextChunk
    {
        public HorizontalTextChunk(String string, Vector startLocation, Vector endLocation, float charSpaceWidth)
        {
            super(string, startLocation, endLocation, charSpaceWidth);
        }

        @Override
        public int compareTo(TextChunk rhs)
        {
            if (rhs instanceof HorizontalTextChunk)
            {
                HorizontalTextChunk horRhs = (HorizontalTextChunk) rhs;
                int rslt = Integer.compare(getLineNumber(), horRhs.getLineNumber());
                if (rslt != 0) return rslt;
                return Float.compare(getStartLocation().get(Vector.I1), rhs.getStartLocation().get(Vector.I1));
            }
            else
                return super.compareTo(rhs);
        }

        @Override
        public boolean sameLine(TextChunk as)
        {
            if (as instanceof HorizontalTextChunk)
            {
                HorizontalTextChunk horAs = (HorizontalTextChunk) as;
                return getLineNumber() == horAs.getLineNumber();
            }
            else
                return super.sameLine(as);
        }

        public int getLineNumber()
        {
            Vector startLocation = getStartLocation();
            float y = startLocation.get(Vector.I2);
            List<Float> flips = textLineFinder.verticalFlips;
            if (flips == null || flips.isEmpty())
                return 0;
            if (y < flips.get(0))
                return flips.size() / 2 + 1;
            for (int i = 1; i < flips.size(); i+=2)
            {
                if (y < flips.get(i))
                {
                    return (1 + flips.size() - i) / 2;
                }
            }
            return 0;
        }
    }

    @Override
    public void renderText(TextRenderInfo renderInfo)
    {
        textLineFinder.renderText(renderInfo);

        LineSegment segment = renderInfo.getBaseline();
        if (renderInfo.getRise() != 0){ // remove the rise from the baseline - we do this because the text from a super/subscript render operations should probably be considered as part of the baseline of the text the super/sub is relative to 
            Matrix riseOffsetTransform = new Matrix(0, -renderInfo.getRise());
            segment = segment.transformBy(riseOffsetTransform);
        }
        TextChunk location = new HorizontalTextChunk(renderInfo.getText(), segment.getStartPoint(), segment.getEndPoint(), renderInfo.getSingleSpaceWidth());
        getLocationalResult().add(location);        
    }

    public HorizontalTextExtractionStrategy() throws NoSuchFieldException, SecurityException
    {
        locationalResultField = LocationTextExtractionStrategy.class.getDeclaredField("locationalResult");
        locationalResultField.setAccessible(true);

        textLineFinder = new TextLineFinder();
    }

    @SuppressWarnings("unchecked")
    List<TextChunk> getLocationalResult()
    {
        try
        {
            return (List<TextChunk>) locationalResultField.get(this);
        }
        catch (IllegalArgumentException | IllegalAccessException e)
        {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
    }

    final Field locationalResultField;
    final TextLineFinder textLineFinder;
}

( HorizontalTextExtractionStrategy.java )

This TextExtractionStrategy uses a TextLineFinder to identify horizontal text lines and then uses this information to sort the text chunks.
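The mapping from a baseline y coordinate to a line number, and the resulting sort order, can be tried in isolation with the sketch below. It repeats the getLineNumber logic on a hard-coded flips list. The `Chunk` class, the coordinates, and the simplified joining (no spaces inserted within a line) are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

public class LineNumberDemo {
    /** Made-up stand-in for a text chunk: text plus baseline start (x, y). */
    static final class Chunk {
        final String text;
        final float x;
        final float y;

        Chunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Same logic as getLineNumber: the top band is line 1, numbers grow downwards. */
    static int lineNumber(List<Float> flips, float y) {
        if (flips == null || flips.isEmpty())
            return 0;
        if (y < flips.get(0))
            return flips.size() / 2 + 1;
        for (int i = 1; i < flips.size(); i += 2) {
            if (y < flips.get(i))
                return (1 + flips.size() - i) / 2;
        }
        return 0;
    }

    /** Sorts by line number, then x; joins a line without separators (a simplification). */
    static String extract(List<Float> flips, List<Chunk> chunks) {
        List<Chunk> sorted = new ArrayList<>(chunks);
        sorted.sort((a, b) -> {
            int rslt = Integer.compare(lineNumber(flips, a.y), lineNumber(flips, b.y));
            return rslt != 0 ? rslt : Float.compare(a.x, b.x);
        });
        StringBuilder sb = new StringBuilder();
        Chunk last = null;
        for (Chunk c : sorted) {
            if (last != null && lineNumber(flips, last.y) != lineNumber(flips, c.y))
                sb.append('\n');
            sb.append(c.text);
            last = c;
        }
        return sb.toString();
    }

    /** Hypothetical input: two bands, chunks deliberately given out of order. */
    static String demo() {
        List<Float> flips = List.of(680f, 692f, 698f, 712f);
        List<Chunk> chunks = List.of(
                new Chunk(")]", 186, 700), new Chunk("next line", 100, 683),
                new Chunk("2", 180, 699), new Chunk("[CO", 100, 700),
                new Chunk("/(CO + CO", 126, 700), new Chunk("2", 120, 699));
        return extract(flips, chunks);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because both subscripts fall into the same band as the main text, they get the same line number and are ordered purely by their x position.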

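Beware, this code uses reflection to access private parent class members. This might not be allowed in all environments. In such a case, simply copy LocationTextExtractionStrategy and insert the code directly.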
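For readers unfamiliar with the technique, the reflection pattern used here looks roughly like the following stand-alone sketch. `Parent`, `Child`, and the `results` field are hypothetical names chosen for this example; in the real code the parent is the library class and the field is its private result list.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class PrivateFieldAccessDemo {
    /** Hypothetical parent that keeps its collected results private. */
    static class Parent {
        private final List<String> results = new ArrayList<>();

        void collect(String value) {
            results.add(value);
        }
    }

    /** Subclass that reaches the private list of its parent via reflection. */
    static class Child extends Parent {
        private Field resultsField;

        Child() {
            try {
                // Look the field up on the declaring class, not on the subclass.
                resultsField = Parent.class.getDeclaredField("results");
                // This call is what a restrictive environment may refuse.
                resultsField.setAccessible(true);
            } catch (NoSuchFieldException e) {
                throw new IllegalStateException(e);
            }
        }

        @SuppressWarnings("unchecked")
        List<String> results() {
            try {
                return (List<String>) resultsField.get(this);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Child child = new Child();
        child.collect("chunk");
        System.out.println(child.results());
    }
}
```

The lookup depends on the field name as a string, so it breaks silently at runtime if the parent class is refactored. This is another reason to prefer copying the class when that is an option.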

Extracting the text

Now one can use this text extraction strategy to extract text with inline superscripts and subscripts like this:

String extract(PdfReader reader, int pageNo) throws IOException, NoSuchFieldException, SecurityException
{
    return PdfTextExtractor.getTextFromPage(reader, pageNo, new HorizontalTextExtractionStrategy());
}

(from ExtractSuperAndSubInLine.java)

The example text on page 11 of the OP's document, under "COMBUSTION EFFICIENCY", is now extracted like this:

monoxide (CO) in flue gas in accordance with the following formula:   C.E. = [CO 2/(CO + CO 2 )]

The same approach using C# and iTextSharp

The explanations, warnings, and sample results from the Java-centric section still apply; here is the code:

I'm using iTextSharp 5.5.7.

An approach to identify lines

using System;
using System.Collections.Generic;
using iTextSharp.text.pdf.parser;

public class TextLineFinder : IRenderListener
{
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderImage(ImageRenderInfo renderInfo) { }

    public void RenderText(TextRenderInfo renderInfo)
    {
        LineSegment ascentLine = renderInfo.GetAscentLine();
        LineSegment descentLine = renderInfo.GetDescentLine();
        float[] yCoords = new float[]{
            ascentLine.GetStartPoint()[Vector.I2],
            ascentLine.GetEndPoint()[Vector.I2],
            descentLine.GetStartPoint()[Vector.I2],
            descentLine.GetEndPoint()[Vector.I2]
        };
        Array.Sort(yCoords);
        addVerticalUseSection(yCoords[0], yCoords[3]);
    }

    void addVerticalUseSection(float from, float to)
    {
        if (to < from)
        {
            float temp = to;
            to = from;
            from = temp;
        }

        int i=0, j=0;
        for (; i<verticalFlips.Count; i++)
        {
            float flip = verticalFlips[i];
            if (flip < from)
                continue;

            for (j=i; j<verticalFlips.Count; j++)
            {
                flip = verticalFlips[j];
                if (flip < to)
                    continue;
                break;
            }
            break;
        }
        bool fromOutsideInterval = i%2==0;
        bool toOutsideInterval = j%2==0;

        while (j-- > i)
            verticalFlips.RemoveAt(j);
        if (toOutsideInterval)
            verticalFlips.Insert(i, to);
        if (fromOutsideInterval)
            verticalFlips.Insert(i, from);
    }

    public List<float> verticalFlips = new List<float>();
}

Sorting text chunks by those lines

using System;
using System.Collections.Generic;
using iTextSharp.text.pdf.parser;

public class HorizontalTextExtractionStrategy : LocationTextExtractionStrategy
{
    public class HorizontalTextChunk : TextChunk
    {
        public HorizontalTextChunk(String stringValue, Vector startLocation, Vector endLocation, float charSpaceWidth, TextLineFinder textLineFinder)
            : base(stringValue, startLocation, endLocation, charSpaceWidth)
        {
            this.textLineFinder = textLineFinder;
        }

        override public int CompareTo(TextChunk rhs)
        {
            if (rhs is HorizontalTextChunk)
            {
                HorizontalTextChunk horRhs = (HorizontalTextChunk) rhs;
                int rslt = CompareInts(getLineNumber(), horRhs.getLineNumber());
                if (rslt != 0) return rslt;
                return CompareFloats(StartLocation[Vector.I1], rhs.StartLocation[Vector.I1]);
            }
            else
                return base.CompareTo(rhs);
        }

        public override bool SameLine(TextChunk a)
        {
            if (a is HorizontalTextChunk)
            {
                HorizontalTextChunk horAs = (HorizontalTextChunk) a;
                return getLineNumber() == horAs.getLineNumber();
            }
            else
                return base.SameLine(a);
        }

        public int getLineNumber()
        {
            Vector startLocation = StartLocation;
            float y = startLocation[Vector.I2];
            List<float> flips = textLineFinder.verticalFlips;
            if (flips == null || flips.Count == 0)
                return 0;
            if (y < flips[0])
                return flips.Count / 2 + 1;
            for (int i = 1; i < flips.Count; i+=2)
            {
                if (y < flips[i])
                {
                    return (1 + flips.Count - i) / 2;
                }
            }
            return 0;
        }

        private static int CompareInts(int int1, int int2){
            return int1 == int2 ? 0 : int1 < int2 ? -1 : 1;
        }

        private static int CompareFloats(float float1, float float2)
        {
            return float1 == float2 ? 0 : float1 < float2 ? -1 : 1;
        }

        TextLineFinder textLineFinder;
    }

    public override void RenderText(TextRenderInfo renderInfo)
    {
        textLineFinder.RenderText(renderInfo);

        LineSegment segment = renderInfo.GetBaseline();
        if (renderInfo.GetRise() != 0){ // remove the rise from the baseline - we do this because the text from a super/subscript render operations should probably be considered as part of the baseline of the text the super/sub is relative to 
            Matrix riseOffsetTransform = new Matrix(0, -renderInfo.GetRise());
            segment = segment.TransformBy(riseOffsetTransform);
        }
        TextChunk location = new HorizontalTextChunk(renderInfo.GetText(), segment.GetStartPoint(), segment.GetEndPoint(), renderInfo.GetSingleSpaceWidth(), textLineFinder);
        getLocationalResult().Add(location);        
    }

    public HorizontalTextExtractionStrategy()
    {
        locationalResultField = typeof(LocationTextExtractionStrategy).GetField("locationalResult", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
        textLineFinder = new TextLineFinder();
    }

    List<TextChunk> getLocationalResult()
    {
        return (List<TextChunk>) locationalResultField.GetValue(this);
    }

    System.Reflection.FieldInfo locationalResultField;
    TextLineFinder textLineFinder;
}

Extracting the text

    string extract(PdfReader reader, int pageNo)
    {
        return PdfTextExtractor.GetTextFromPage(reader, pageNo, new HorizontalTextExtractionStrategy());
    }

UPDATE: Changes in `LocationTextExtractionStrategy`

In iText 5.5.9-SNAPSHOT, commits 53526e4854fcb80c86cbc2e113f7a07401dc9a67 ("Refactor LocationTextExtractionStrategy...") through 1ab350beae148be2a4bef5e663b3d67a004ff9f8 ("Make TextChunkLocation a Comparable<> class...") allow customizations like this one without the need for reflection.

Unfortunately, this change breaks the HorizontalTextExtractionStrategy presented above. For iText versions after those commits, one can use the following strategy:

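import java.util.List;

import com.itextpdf.text.pdf.parser.LineSegment;
import com.itextpdf.text.pdf.parser.LocationTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextRenderInfo;
import com.itextpdf.text.pdf.parser.Vector;

public class HorizontalTextExtractionStrategy2 extends LocationTextExtractionStrategy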
{
    public static class HorizontalTextChunkLocationStrategy implements TextChunkLocationStrategy
    {
        public HorizontalTextChunkLocationStrategy(TextLineFinder textLineFinder)
        {
            this.textLineFinder = textLineFinder;
        }

        @Override
        public TextChunkLocation createLocation(TextRenderInfo renderInfo, LineSegment baseline)
        {
            return new HorizontalTextChunkLocation(baseline.getStartPoint(), baseline.getEndPoint(), renderInfo.getSingleSpaceWidth());
        }

        final TextLineFinder textLineFinder;

        public class HorizontalTextChunkLocation implements TextChunkLocation
        {
            /** the starting location of the chunk */
            private final Vector startLocation;
            /** the ending location of the chunk */
            private final Vector endLocation;
            /** unit vector in the orientation of the chunk */
            private final Vector orientationVector;
            /** the orientation as a scalar for quick sorting */
            private final int orientationMagnitude;
            /** perpendicular distance to the orientation unit vector (i.e. the Y position in an unrotated coordinate system)
             * we round to the nearest integer to handle the fuzziness of comparing floats */
            private final int distPerpendicular;
            /** distance of the start of the chunk parallel to the orientation unit vector (i.e. the X position in an unrotated coordinate system) */
            private final float distParallelStart;
            /** distance of the end of the chunk parallel to the orientation unit vector (i.e. the X position in an unrotated coordinate system) */
            private final float distParallelEnd;
            /** the width of a single space character in the font of the chunk */
            private final float charSpaceWidth;

            public HorizontalTextChunkLocation(Vector startLocation, Vector endLocation, float charSpaceWidth)
            {
                this.startLocation = startLocation;
                this.endLocation = endLocation;
                this.charSpaceWidth = charSpaceWidth;

                Vector oVector = endLocation.subtract(startLocation);
                if (oVector.length() == 0)
                {
                    oVector = new Vector(1, 0, 0);
                }
                orientationVector = oVector.normalize();
                orientationMagnitude = (int)(Math.atan2(orientationVector.get(Vector.I2), orientationVector.get(Vector.I1))*1000);

                // see http://mathworld.wolfram.com/Point-LineDistance2-Dimensional.html
                // the two vectors we are crossing are in the same plane, so the result will be purely
                // in the z-axis (out of plane) direction, so we just take the I3 component of the result
                Vector origin = new Vector(0,0,1);
                distPerpendicular = (int)(startLocation.subtract(origin)).cross(orientationVector).get(Vector.I3);

                distParallelStart = orientationVector.dot(startLocation);
                distParallelEnd = orientationVector.dot(endLocation);
            }

            public int orientationMagnitude()   {   return orientationMagnitude;    }
            public int distPerpendicular()      {   return distPerpendicular;       }
            public float distParallelStart()    {   return distParallelStart;       }
            public float distParallelEnd()      {   return distParallelEnd;         }
            public Vector getStartLocation()    {   return startLocation;           }
            public Vector getEndLocation()      {   return endLocation;             }
            public float getCharSpaceWidth()    {   return charSpaceWidth;          }

            /**
             * @param as the location to compare to
             * @return true if this location is on the same line as the other
             */
            public boolean sameLine(TextChunkLocation as)
            {
                if (as instanceof HorizontalTextChunkLocation)
                {
                    HorizontalTextChunkLocation horAs = (HorizontalTextChunkLocation) as;
                    return getLineNumber() == horAs.getLineNumber();
                }
                else
                    return orientationMagnitude() == as.orientationMagnitude() && distPerpendicular() == as.distPerpendicular();
            }

            /**
             * Computes the distance between the end of 'other' and the beginning of this chunk
             * in the direction of this chunk's orientation vector.  Note that it's a bad idea
             * to call this for chunks that aren't on the same line and orientation, but we don't
             * explicitly check for that condition for performance reasons.
             * @param other
             * @return the number of spaces between the end of 'other' and the beginning of this chunk
             */
            public float distanceFromEndOf(TextChunkLocation other)
            {
                float distance = distParallelStart() - other.distParallelEnd();
                return distance;
            }

            public boolean isAtWordBoundary(TextChunkLocation previous)
            {
                /**
                 * Here we handle a very specific case which in PDF may look like:
                 * -.232 Tc [( P)-226.2(r)-231.8(e)-230.8(f)-238(a)-238.9(c)-228.9(e)]TJ
                 * The font's charSpace width is 0.232 and it's compensated with charSpacing of 0.232.
                 * And a resultant TextChunk.charSpaceWidth comes to TextChunk constructor as 0.
                 * In this case every chunk is considered as a word boundary and space is added.
                 * We should consider charSpaceWidth equal (or close) to zero as a no-space.
                 */
                if (getCharSpaceWidth() < 0.1f)
                    return false;

                float dist = distanceFromEndOf(previous);

                return dist < -getCharSpaceWidth() || dist > getCharSpaceWidth()/2.0f;
            }

            public int getLineNumber()
            {
                Vector startLocation = getStartLocation();
                float y = startLocation.get(Vector.I2);
                List<Float> flips = textLineFinder.verticalFlips;
                if (flips == null || flips.isEmpty())
                    return 0;
                if (y < flips.get(0))
                    return flips.size() / 2 + 1;
                for (int i = 1; i < flips.size(); i+=2)
                {
                    if (y < flips.get(i))
                    {
                        return (1 + flips.size() - i) / 2;
                    }
                }
                return 0;
            }

            @Override
            public int compareTo(TextChunkLocation rhs)
            {
                if (rhs instanceof HorizontalTextChunkLocation)
                {
                    HorizontalTextChunkLocation horRhs = (HorizontalTextChunkLocation) rhs;
                    int rslt = Integer.compare(getLineNumber(), horRhs.getLineNumber());
                    if (rslt != 0) return rslt;
                    return Float.compare(getStartLocation().get(Vector.I1), rhs.getStartLocation().get(Vector.I1));
                }
                else
                {
                    int rslt;
                    rslt = Integer.compare(orientationMagnitude(), rhs.orientationMagnitude());
                    if (rslt != 0) return rslt;

                    rslt = Integer.compare(distPerpendicular(), rhs.distPerpendicular());
                    if (rslt != 0) return rslt;

                    return Float.compare(distParallelStart(), rhs.distParallelStart());
                }
            }
        }
    }

    @Override
    public void renderText(TextRenderInfo renderInfo)
    {
        textLineFinder.renderText(renderInfo);
        super.renderText(renderInfo);
    }

    public HorizontalTextExtractionStrategy2() throws NoSuchFieldException, SecurityException
    {
        this(new TextLineFinder());
    }

    public HorizontalTextExtractionStrategy2(TextLineFinder textLineFinder) throws NoSuchFieldException, SecurityException
    {
        super(new HorizontalTextChunkLocationStrategy(textLineFinder));

        this.textLineFinder = textLineFinder;
    }

    final TextLineFinder textLineFinder;
}

( HorizontalTextExtractionStrategy2.java )
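The word-boundary rule in isAtWordBoundary above can be explored on its own with a few numbers, which may help when reasoning about stray spaces such as the ones in "CO 2". The sketch below copies only the arithmetic; the class name `WordBoundaryDemo`, the flattened parameter list, and the sample widths are assumptions made for this example.

```java
public class WordBoundaryDemo {
    /**
     * Same rule as isAtWordBoundary, with the locations reduced to plain numbers:
     * the end of the previous chunk and the start of this chunk along the line.
     */
    static boolean isAtWordBoundary(float charSpaceWidth, float previousEnd, float start) {
        // A (near) zero space width carries no information, so never report a boundary.
        if (charSpaceWidth < 0.1f)
            return false;
        float dist = start - previousEnd;
        // Boundary if the chunk jumps back by more than a space
        // or leaves a gap wider than half a space.
        return dist < -charSpaceWidth || dist > charSpaceWidth / 2.0f;
    }

    public static void main(String[] args) {
        float spaceWidth = 2.5f; // hypothetical width of a space in this font
        System.out.println(isAtWordBoundary(spaceWidth, 100f, 100.5f)); // small gap
        System.out.println(isAtWordBoundary(spaceWidth, 100f, 102f));   // gap above half a space
        System.out.println(isAtWordBoundary(spaceWidth, 100f, 97f));    // jump back by more than a space
        System.out.println(isAtWordBoundary(0.05f, 100f, 110f));        // degenerate space width
    }
}
```

Since the threshold scales with the space width of the chunk being tested, a chunk set in a smaller font (as subscripts usually are) is reported as a word boundary at a smaller gap than its full-size neighbours.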

Regarding c# - How do I correctly extract subscripts/superscripts from a PDF using iTextSharp?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33492792/
