
python - How to convert a list to a numpy array

Reposted. Author: 行者123. Updated: 2023-12-03 20:52:46

Here is the Colab link: https://colab.research.google.com/drive/1wftAvDu_Wu2Y9ahgI1Z1FLciUH5MnSJ9

train_labels = ['GovernmentSchemes', 'GovernmentSchemes', 'GovernmentSchemes', 'GovernmentSchemes', 'CropInsurance']

training_label_seq = np.array(label_tokenizer.texts_to_sequences(train_labels))

The output is:
[list([3]) list([3]) list([3]) ... list([2]) list([5]) list([1])]

Expected output:
[[3] [3] [3] .. [2] [5]...]
num_epochs = 30
history = model.fit(train_padded, training_label_seq, epochs=num_epochs, validation_data=(validation_padded, validation_label_seq))

Error => ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type list).

Best answer

I was able to recreate your issue using the code below -

Code to reproduce the issue -

import numpy as np
import tensorflow as tf
print(tf.__version__)
from tensorflow.keras.preprocessing.text import Tokenizer

label_tokenizer = Tokenizer()

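# Fit on a text (a plain string is iterated character by character,
# so the tokens in this example are single characters)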
fit_text = "Tensorflow warriors are awesome people"
label_tokenizer.fit_on_texts(fit_text)

# Training Labels
train_labels = "Tensorflow warriors are great people"
training_label_list = np.array(label_tokenizer.texts_to_sequences(train_labels))

# Print the result and its types
print(training_label_list)
print(type(training_label_list))
print(type(training_label_list[0]))

Output -
2.2.0
[list([9]) list([1]) list([10]) list([5]) list([3]) list([2]) list([11])
list([7]) list([3]) list([6]) list([]) list([6]) list([4]) list([2])
list([2]) list([12]) list([3]) list([2]) list([5]) list([]) list([4])
list([2]) list([1]) list([]) list([4]) list([2]) list([1]) list([])
list([]) list([2]) list([1]) list([4]) list([9]) list([]) list([8])
list([1]) list([3]) list([8]) list([7]) list([1])]
<class 'numpy.ndarray'>
<class 'list'>
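
The list([...]) entries in the output above indicate that NumPy built a 1-D array of dtype object holding Python lists, not a numeric array, and that is the kind of array the "Unsupported object type list" error refers to. A minimal sketch for checking this, assuming a small made-up `sequences` list in place of the real tokenizer output:

```python
from collections import Counter

import numpy as np

# Hypothetical stand-in for label_tokenizer.texts_to_sequences(...) output.
sequences = [[9], [1], [], [5], [3]]

# dtype=object is passed explicitly because newer NumPy releases raise an
# error instead of silently building a ragged array.
arr = np.array(sequences, dtype=object)
print(arr.dtype)
print(arr.shape)

# Count how many sequences have each length; more than one distinct length
# means the data is ragged and cannot form a regular numeric array.
length_counts = Counter(len(seq) for seq in sequences)
print(length_counts)
```

If `length_counts` has a single key, the sequences all have the same length and a plain np.array call should produce a regular 2-D array.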

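Solution -
  • Replacing np.array with np.hstack will fix your problem. Your model.fit() should now work fine.
  • Otherwise, if you are looking for the expected output from the question, training_label_list = label_tokenizer.texts_to_sequences(train_labels) gives you a list of lists. You can convert it to an array of arrays with np.array([np.array(i) for i in training_label_list]). This only works when every inner list in your list of lists has the same number of elements.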
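
The difference between the two options can be seen without TensorFlow. The sketch below uses made-up lists (`equal_sequences` and `ragged_sequences` are assumptions for illustration, not the asker's data) to show what np.hstack and np.array return:

```python
import numpy as np

# Hypothetical tokenizer output: every label maps to exactly one token.
equal_sequences = [[3], [3], [3], [2], [5]]

flat = np.hstack(equal_sequences)    # concatenates into a 1-D array
column = np.array(equal_sequences)   # regular 2-D array, one row per label
print(flat.shape, column.shape)

# Hypothetical ragged output: one empty sequence, one with two tokens.
ragged_sequences = [[3], [], [2, 7]]

# np.hstack still returns a flat numeric array, but empty sequences vanish
# and multi-token sequences contribute several entries, so the result may
# no longer have one entry per label.
flat_ragged = np.hstack(ragged_sequences)
print(flat_ragged, flat_ragged.dtype)
```

Note that the empty list is treated as a float64 array, which is why the flattened result is printed with floating-point values.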

  • np.hstack code - the code for point 1 of the solution.
    import numpy as np
    import tensorflow as tf
    print(tf.__version__)
    from tensorflow.keras.preprocessing.text import Tokenizer

    label_tokenizer = Tokenizer()

    # Fit on a text
    fit_text = "Tensorflow warriors are awesome people"
    label_tokenizer.fit_on_texts(fit_text)

    # Training Labels
    train_labels = "Tensorflow warriors are great people"
    training_label_list = np.hstack(label_tokenizer.texts_to_sequences(train_labels))

    # Print the result and its types
    print(training_label_list)
    print(type(training_label_list))
    print(type(training_label_list[0]))

    Output -
    2.2.0
    [ 9. 1. 10. 4. 2. 3. 11. 7. 2. 5. 5. 6. 3. 3. 12. 2. 3. 4.
    6. 3. 1. 3. 1. 6. 9. 8. 1. 2. 8. 7. 1.]
    <class 'numpy.ndarray'>
    <class 'numpy.float64'>

    Expected output from the question - the code for point 2 of the solution.
    import numpy as np
    import tensorflow as tf
    print(tf.__version__)
    from tensorflow.keras.preprocessing.text import Tokenizer

    label_tokenizer = Tokenizer()

    # Fit on a text
    fit_text = "Tensorflow warriors are awesome people"
    label_tokenizer.fit_on_texts(fit_text)

    # Training Labels
    train_labels = "Tensorflow warriors are great people"
    training_label_list = label_tokenizer.texts_to_sequences(train_labels)

    # Print
    print(training_label_list)
    print(type(training_label_list))
    print(type(training_label_list[0]))

    # To convert elements to array
    training_label_list = np.array([np.array(i) for i in training_label_list])

    # Print
    print(training_label_list)
    print(type(training_label_list))
    print(type(training_label_list[0]))

    Output -
    2.2.0
    [[9], [1], [10], [4], [2], [3], [11], [7], [2], [5], [], [5], [6], [3], [3], [12], [2], [3], [4], [], [6], [3], [1], [], [], [3], [1], [6], [9], [], [8], [1], [2], [8], [7], [1]]
    <class 'list'>
    <class 'list'>
    [array([9]) array([1]) array([10]) array([4]) array([2]) array([3])
    array([11]) array([7]) array([2]) array([5]) array([], dtype=float64)
    array([5]) array([6]) array([3]) array([3]) array([12]) array([2])
    array([3]) array([4]) array([], dtype=float64) array([6]) array([3])
    array([1]) array([], dtype=float64) array([], dtype=float64) array([3])
    array([1]) array([6]) array([9]) array([], dtype=float64) array([8])
    array([1]) array([2]) array([8]) array([7]) array([1])]
    <class 'numpy.ndarray'>
    <class 'numpy.ndarray'>

    Hope this answers your question. Happy learning.

    Update 2/6/2020 - Anirudh_k07, as per our discussion, I looked at your program. You get the following error in model.fit() after using np.hstack for the labels.
    ValueError: Data cardinality is ambiguous:
    x sizes: 41063
    y sizes: 41429
    Please provide data which shares the same first dimension.

    You get this error because a few labels contain special characters such as - and /. As a result, np.hstack(label_tokenizer.texts_to_sequences(train_labels)) creates extra rows for them. You can print the unique labels in train_labels with print(set(train_labels)).
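
One way to locate the offending labels is to search for any character that is not a letter or a digit. This is a rough sketch: the regular expression is an assumption about which characters cause a split, not the tokenizer's exact filter list, and `suspect_labels` is a name introduced here for illustration.

```python
import re

# Labels taken from the example in this answer.
train_labels = [
    'Bio-PesticidesandBio-Fertilizers',
    'Old/SenileOrchardRejuvenation',
    'SoilHealthCard',
]

# Assumption: any character other than an ASCII letter or digit may cause
# the tokenizer to split a label into several tokens.
special = re.compile(r"[^0-9A-Za-z]")

# Deduplicate with a set, then sort for a stable listing.
suspect_labels = sorted({label for label in train_labels if special.search(label)})
print(suspect_labels)
```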

    Here is the gist of what I am trying to say -
    # These Labels have special character
    train_labels = ['Bio-PesticidesandBio-Fertilizers','Old/SenileOrchardRejuvenation']
    training_label_seq = np.hstack(label_tokenizer.texts_to_sequences(train_labels))
    print("Two labels are converted to Five :",training_label_seq)

    # These Labels are fine
    train_labels = ['SoilHealthCard', 'PostHarvestPreservation', 'FertilizerUseandAvailability']
    training_label_seq = np.hstack(label_tokenizer.texts_to_sequences(train_labels))
    print("Three labels remain three :",training_label_seq)

    Output -
    Two labels are converted to Five : [17 18 19 51 52]
    Three labels remain three : [20 36 5]

    So please do proper preprocessing and remove these special characters from train_labels. Then use np.hstack(label_tokenizer.texts_to_sequences(train_labels)) on the labels. Your model.fit() should work fine after that.
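
The preprocessing step could look roughly like the sketch below. `clean_label` and `check_cardinality` are hypothetical helpers, not part of Keras or NumPy, and stripping everything except ASCII letters and digits is one possible cleanup rule chosen for illustration.

```python
import re


def clean_label(label):
    """Hypothetical helper: keep only ASCII letters and digits."""
    return re.sub(r"[^0-9A-Za-z]", "", label)


def check_cardinality(features, labels):
    """Hypothetical helper: fail early if x and y differ in length."""
    if len(features) != len(labels):
        raise ValueError(
            f"x has {len(features)} rows but y has {len(labels)} rows"
        )


# Labels with special characters, taken from the example in this answer.
train_labels = ['Bio-PesticidesandBio-Fertilizers', 'Old/SenileOrchardRejuvenation']

# After cleaning, each label is a single unbroken word.
cleaned_labels = [clean_label(label) for label in train_labels]
print(cleaned_labels)
```

The cleaned labels would need to be used both when fitting the tokenizer and when converting labels to sequences. Calling check_cardinality(train_padded, training_label_seq) before model.fit() would surface a length mismatch with a clearer message.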

    Hope this answers your question. Happy learning.

    Regarding python - how to convert a list to a numpy array, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62102576/
