c++ - Duplicating Windows Cryptographic Service Provider results in Python w/ PyCrypto

Reposted by: 太空狗. Updated: 2023-10-29 23:06:29

Edits and updates

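March 24, 2013:
After converting to UTF-16 and stopping before any "e" or "m" byte, my Python hash now matches the hash from C++. The decrypted result still does not match. My SHA1 hash is 20 bytes = 160 bits, while an RC4 key can be anywhere from 40 to 2048 bits long, so I may need to mimic some default salting that WinCrypt performs. I will check CryptGetKeyParam with KP_LENGTH or KP_SALT.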

March 24, 2013:
CryptGetKeyParam with KP_LENGTH tells me my key length is 128 bits. I am feeding it a 160-bit hash, so perhaps it just discards the last 32 bits, i.e. 4 bytes. Testing now.

March 24, 2013: Yes, that was it. If I discard the last 4 bytes of my SHA1 hash in Python, I get the same decryption result.
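The truncation described in these updates can be sketched with the standard library alone. This is a minimal illustration, assuming 2-byte little-endian wide characters; the helper name `rc4_key_from_wide_bytes` and the sample input are hypothetical and do not come from the original program.

```python
import hashlib

def rc4_key_from_wide_bytes(wide_bytes, key_bits=128):
    """Hash the wide-character bytes with SHA-1 and keep the first key_bits bits."""
    digest = hashlib.sha1(wide_bytes).digest()   # SHA-1 digests are 20 bytes (160 bits)
    return digest[:key_bits // 8]                # 128 bits -> first 16 bytes

# Hypothetical input: "Monk" as 2-byte little-endian wide characters.
sample = "Monk".encode("utf-16-le")
key = rc4_key_from_wide_bytes(sample)
print(len(hashlib.sha1(sample).digest()), len(key))
```

The last 4 bytes of the digest are simply unused, which matches the 128-bit length reported by KP_LENGTH.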

Quick info:

I have a C++ program that decrypts a block of data. It uses the Windows Cryptographic Service Provider, so it only runs on Windows. I would like it to work on other platforms.

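Method overview:

In the Windows crypto API, the ASCII-encoded password bytes are converted to a wide-character representation and then hashed with SHA1 to produce the key for the RC4 stream cipher.

In Python with PyCrypto, the ASCII-encoded byte string is decoded to a Python string. It is truncated at the bytes that, by empirical observation, cause mbstowcs to stop converting in C++. The truncated string is then encoded as UTF-16, which effectively pads 0x00 bytes between the characters. This truncated, padded byte string is passed to the SHA1 hash, and the first 128 bits of the digest are passed to the PyCrypto RC4 object.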

Problem [SOLVED]
I cannot seem to get the same results using Python 3.x with PyCrypto.

C++ code skeleton:

HCRYPTPROV hProv      = 0x00;
HCRYPTHASH hHash      = 0x00;
HCRYPTKEY  hKey       = 0x00;
wchar_t    sBuf[256]  = {0};

CryptAcquireContextW(&hProv, L"FileContainer", L"Microsoft Enhanced RSA and AES Cryptographic Provider", 0x18u, 0);

CryptCreateHash(hProv, 0x8004u, 0, 0, &hHash);
//0x8004u is SHA1 flag

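int len = mbstowcs(sBuf, iRec->desc, sizeof(sBuf) / sizeof(sBuf[0]));
//iRec is my "Record" class
//iRec->desc is 33 bytes within header of my encrypted file
//this will be used to create the hash key. (So this is the password)
//the third argument is a count of wide characters, not bytes
//len is the number of wide characters converted

CryptHashData(hHash, (const BYTE*)sBuf, len, 0);
//note: CryptHashData takes a length in bytes, so this call hashes
//only the first len bytes of sBuf, not len wide characters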

CryptDeriveKey(hProv, 0x6801, hHash, 0, &hKey);

DWORD dataLen = iRec->compLen;  
//iRec->compLen is the length of encrypted datablock
//it's also compressed that's why it's called compLen

CryptDecrypt(hKey, 0, 0, 0, (BYTE*)iRec->decrypt, &dataLen);
// iRec is my record that i'm decrypting
// iRec->decrypt is where I store the decrypted data
//&dataLen is how long the encrypted data block is.
//I get this from file header info

Python code skeleton:

from Crypto.Cipher import ARC4
from Crypto.Hash import SHA

#this is the Decipher method from my record class
def Decipher(self):

    #get string representation of 33byte password
    key_string= self.desc.decode('ASCII')

    #so far, these characters fail, possibly others but
    #for now I will make it a list
    stop_chars = ['e','m']

    #slice off anything beyond where mbstowcs will stop
    for char in stop_chars:
        wc_stop = key_string.find(char)
        if wc_stop != -1:
            #slice operation
            key_string = key_string[:wc_stop]

    #make "wide character"
    #this is equivalent to padding bytes with 0x00

    #encode as little-endian UTF-16, which emits no "Byte Order Mark",
    #so there is no 0xff 0xfe prefix to slice off
    wc_byte_string = key_string.encode('utf-16-le')

    #slice off the trailing 0x00
    wc_byte_string = wc_byte_string[:-1]

    #hash the "wchar" byte string
    #this is the equivalent to sBuf in c++ code above
    #as determined by writing sBuf to file in tests
    my_key = SHA.new(wc_byte_string).digest()

    #create a PyCrypto cipher object
    RC4_Cipher = ARC4.new(my_key[:16])

    #store the decrypted data..these results NOW MATCH
    self.decrypt = RC4_Cipher.decrypt(self.datablock)
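For checking results on a machine without PyCrypto, a dependency-free RC4 can stand in for the ARC4 object. The sketch below is a plain textbook RC4, not the asker's code and not PyCrypto's implementation; the function name `rc4` is hypothetical, and the key and plaintext are the widely published "Key" / "Plaintext" test vector rather than data from the question.

```python
def rc4(key, data):
    """Encrypt or decrypt data with RC4 (the same operation does both)."""
    # Key-scheduling algorithm: permute the state using the key bytes.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: XOR each data byte with a keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())          # bbf316e8d940af0ad3
print(rc4(b"Key", ciphertext))   # b'Plaintext'
```

In the skeleton above, a call such as `rc4(my_key[:16], self.datablock)` would take the place of the two PyCrypto lines, assuming `self.datablock` is a bytes object.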

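Suspected [Edit: confirmed] causes
1. The mbstowcs conversion of the password means the "raw data" fed to the SHA1 hash is not the same in Python and C++. mbstowcs stops converting at 0x65 and 0x6D bytes, so the raw data ends up being the wide-character encoding of only part of the original 33-byte password.

2. RC4 can have a variable-length key. In the Enhanced Win Crypt Service provider, the default length is 128 bits. When no key length is specified, the key is the first 128 bits of the 160-bit SHA1 digest of the "raw data".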

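How I investigated

Edit: Based on my own experiments and @RolandSmith's suggestions, I now know that one of my problems is that mbstowcs behaves in a way I did not expect. It appears to stop writing to sBuf at "e" (0x65) and "m" (0x6d), and possibly at other characters. So the password "Monkey" (ASCII-encoded bytes) from my description looks like "M o n k" in sBuf, because mbstowcs stops at the e and, given the 2-byte wchar typedef on my system, places a 0x00 between the bytes. I found this by writing the conversion result to a text file.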

BYTE pbHash[256];  //buffer we will store the hash digest in 
DWORD dwHashLen;  //store the length of the hash
DWORD dwCount;
dwCount = sizeof(DWORD);  //how big is a dword on this system?


//see above: "len" is the return value from mbstowcs, which tells how
//many multibyte characters were converted from the original
//iRec->desc and placed into sBuf.  In some cases it's 3, 7, 9
//and always seems to stop on "e" or "m"

fstream outFile4("C:/desc_mbstowcs.txt", ios::out | ios::trunc | ios::binary);
outFile4.write((const CHAR*)sBuf, int(len));
outFile4.close();

//now get the hash size from CryptGetHashParam
//and get the actual hash from the hash object hHash
//(0x0002 is HP_HASHVAL), then write it to a file.
if(CryptGetHashParam(hHash, HP_HASHSIZE, (BYTE *)&dwHashLen, &dwCount, 0)) {
  if(CryptGetHashParam(hHash, 0x0002, pbHash, &dwHashLen,0)){

    fstream outFile3("C:/test_hash.txt", ios::out | ios::trunc | ios::binary);
    outFile3.write((const CHAR*)pbHash, int(dwHashLen));
    outFile3.close();
  }
}
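To make the sBuf layout for the "Monkey" example concrete, this sketch prints the bytes involved, assuming 2-byte little-endian wchar_t as on the asker's Windows system. `wide_bytes` is a hypothetical helper; it produces the same bytes the Python skeleton obtains from `encode('utf-16')[2:]` on a little-endian machine.

```python
def wide_bytes(text):
    """Return text as 2-byte little-endian wide characters, without a byte order mark."""
    return text.encode("utf-16-le")

full = wide_bytes("Monkey")      # every character is followed by a 0x00 byte
truncated = wide_bytes("Monk")   # what the question reports seeing in sBuf
hashed = truncated[:-1]          # the Python skeleton also drops the trailing 0x00

print(full)
print(truncated)
print(len(hashed))
```

Comparing these bytes with the contents of C:/desc_mbstowcs.txt is one way to confirm that both programs feed identical input to SHA1.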

References:
Wide characters can cause problems depending on how the environment defines them
Difference in Windows Cryptography Service between VC++ 6.0 and VS 2008

Converting a utf-8 string to a utf-16 string
Python - converting wide-char strings from a binary file to Python unicode strings

PyCrypto RC4 example
https://www.dlitz.net/software/pycrypto/api/current/Crypto.Cipher.ARC4-module.html

Hashing a string with Sha256

http://msdn.microsoft.com/en-us/library/windows/desktop/aa379916(v=vs.85).aspx

http://msdn.microsoft.com/en-us/library/windows/desktop/aa375599(v=vs.85).aspx

Best answer

You can test the size of wchar_t with a small test program (in C):

#include <stdio.h> /* for printf */
#include <stddef.h> /* for wchar_t */

int main(int argc, char *argv[]) {
    /* sizeof yields a size_t; cast it so the format specifier matches */
    printf("The size of wchar_t is %lu bytes.\n", (unsigned long)sizeof(wchar_t));
    return 0;
}

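You can also use printf() calls in your C++ code to print, for example, the contents of iRec->desc and sBuf and the resulting hash to the screen (if you can run the C++ program from a terminal). Otherwise use fprintf() to dump them to a file.

To mimic the behavior of the C++ program more closely, you could even use ctypes to call mbstowcs() from your Python code.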

Edit: You wrote:

One problem is definitely with mbctowcs. It seems that it's transferring an unpredictable (to me) number of bytes into my buffer to be hashed.

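Remember that mbstowcs returns the number of wide characters converted. In other words, a 33-byte buffer in a multibyte encoding can contain anything from 5 (UTF-8 6-byte sequences) to 33 characters, depending on the encoding used.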
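The difference between a byte count and a character count can be seen with Python's own decoders. This is only an illustration of the counting, not a model of mbstowcs itself; `char_count` and both sample buffers are hypothetical.

```python
def char_count(raw, encoding):
    """Number of characters obtained when raw is decoded with the given encoding."""
    return len(raw.decode(encoding))

ascii_buf = b"A" * 33                       # 33 one-byte characters
euro_buf = "\u20ac".encode("utf-8") * 11    # 11 three-byte characters (euro sign)

print(len(ascii_buf), char_count(ascii_buf, "ascii"))   # 33 33
print(len(euro_buf), char_count(euro_buf, "utf-8"))     # 33 11
```

Both buffers are 33 bytes long, yet they hold different numbers of characters, so a return value from mbstowcs should not be treated as a byte length.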

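Edit2: You are passing 0 as the dwFlags parameter of CryptDeriveKey. According to its documentation, the upper 16 bits should contain the key length. You should also check the return value of CryptDeriveKey to see whether the call succeeded.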
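The arithmetic for placing a key length in the upper 16 bits can be sketched as follows. `derive_key_flags` is a hypothetical helper that only computes the integer; it does not call the Windows API.

```python
def derive_key_flags(key_bits, other_flags=0):
    """Place the key length in the upper 16 bits of a 32-bit dwFlags value."""
    if not 0 <= key_bits <= 0xFFFF:
        raise ValueError("key length must fit in 16 bits")
    return ((key_bits << 16) | (other_flags & 0xFFFF)) & 0xFFFFFFFF

print(hex(derive_key_flags(128)))   # 0x800000, i.e. 0x00800000 for a 128-bit key
print(derive_key_flags(0) >> 16)    # 0: no explicit length, so the provider default applies
```

With dwFlags set to 0, as in the question, the upper 16 bits are zero and the provider falls back to its default key length, which the asker measured as 128 bits via KP_LENGTH.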

Edit3: You can test mbstowcs in Python (I am using IPython here):

In [1]: from ctypes import *

In [2]: libc = CDLL('libc.so.7')

In [3]: monkey = c_char_p(u'Monkey')

In [4]: test = c_char_p(u'This is a test')

In [5]: wo = create_unicode_buffer(256)

In [6]: nref = c_size_t(250)

In [7]: libc.mbstowcs(wo, monkey, nref)
Out[7]: 6

In [8]: print wo.value
Monkey

In [9]: libc.mbstowcs(wo, test, nref)
Out[9]: 14

In [10]: print wo.value
This is a test

Note that on Windows you should probably use libc = cdll.msvcrt instead of libc = CDLL('libc.so.7').
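The session above uses Python 2 syntax. A rough Python 3 equivalent is sketched below, where c_char_p data must be bytes. The library lookup via `ctypes.util.find_library` and the wrapper name `mbstowcs_py` are assumptions of this sketch, and the result depends on the C library and locale in use.

```python
import ctypes
import ctypes.util
import sys

def load_libc():
    """Load the C runtime: msvcrt on Windows, the platform libc elsewhere."""
    if sys.platform.startswith("win"):
        return ctypes.cdll.msvcrt
    return ctypes.CDLL(ctypes.util.find_library("c"))

libc = load_libc()
libc.mbstowcs.argtypes = [ctypes.c_wchar_p, ctypes.c_char_p, ctypes.c_size_t]
libc.mbstowcs.restype = ctypes.c_size_t

def mbstowcs_py(data, capacity=256):
    """Convert a bytes object with the C library's mbstowcs; return (count, text)."""
    buf = ctypes.create_unicode_buffer(capacity)
    # Leave room for the terminating null wide character.
    count = libc.mbstowcs(buf, data, capacity - 1)
    if count == ctypes.c_size_t(-1).value:
        raise ValueError("mbstowcs reported an invalid multibyte sequence")
    return count, buf.value

print(mbstowcs_py(b"Monkey"))
print(mbstowcs_py(b"This is a test"))
```

Running this against the actual 33-byte iRec->desc value on the target Windows machine shows how many wide characters the C runtime really converts.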

Regarding "c++ - Duplicating Windows Cryptographic Service Provider results in Python w/ PyCrypto", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15537775/
