Performance of mass-evaluating expressions in IronPython

Reposted. Author: 行者123. Updated: 2023-12-01 05:41:09

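In a C# 4.0 application, I have a dictionary of strongly typed ILists that all have the same length: a dynamic, strongly typed, column-based table.
I want the user to provide one or more (Python) expressions over the available columns, which are then aggregated over all rows. In a static context this would be: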

IDictionary<string, IList> table;
// ...
IList<int> a = table["a"] as IList<int>;
IList<int> b = table["b"] as IList<int>;
double sum = 0;
for (int i = 0; i < n; i++)
    sum += (double)a[i] / b[i]; // Expression to sum up

For n = 10^7 this runs in 0.270 s on my laptop (Win7 x64). Replacing the expression with a delegate that takes two int arguments takes 0.580 s; an untyped delegate takes 1.19 s.
Creating the delegate from IronPython with
IDictionary<string, IList> table;
// ...
var options = new Dictionary<string, object>();
options["DivisionOptions"] = PythonDivisionOptions.New;
var engine = Python.CreateEngine(options);
string expr = "a / b";
Func<int, int, double> f = engine.Execute("lambda a, b : " + expr);

IList<int> a = table["a"] as IList<int>;
IList<int> b = table["b"] as IList<int>;
double sum = 0;
for (int i = 0; i < n; i++)
    sum += f(a[i], b[i]);

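it takes 3.2 s (and 5.1 s with Func<object, object, object>), a factor of about 4 to 5.5 compared with the corresponding C# delegates (0.580 s typed, 1.19 s untyped). Is this the expected overhead for what I am doing? What could be improved?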

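If I have many columns, the approach chosen above will no longer be sufficient. One solution could be to determine the columns each expression requires and pass only those as arguments. The other solution, which I tried without success, was to use a ScriptScope and resolve the columns dynamically. For that I defined a RowIterator that has a RowIndex for the active row and one property per column.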
class RowIterator
{
    IList<int> la;
    IList<int> lb;

    public RowIterator(IList<int> a, IList<int> b)
    {
        this.la = a;
        this.lb = b;
    }

    public int RowIndex { get; set; }

    public int a { get { return la[RowIndex]; } }
    public int b { get { return lb[RowIndex]; } }
}
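
The first alternative mentioned above, finding out which columns an expression actually needs and passing only those as arguments, can be sketched in plain Python. This is a minimal sketch that runs under CPython 3; whether the `ast` module behaves the same way inside a given IronPython version is an assumption, and the names `referenced_columns`, `compile_row_function` and `aggregate` are hypothetical helpers, not part of IronPython or the question's code.

```python
import ast


def referenced_columns(expr, available):
    """Return the sorted names used in `expr` that are also known columns."""
    tree = ast.parse(expr, mode="eval")
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(name for name in used if name in available)


def compile_row_function(expr, available):
    """Build 'lambda <needed columns>: <expr>' once and return it with its columns."""
    cols = referenced_columns(expr, available)
    source = "lambda {}: {}".format(", ".join(cols), expr)
    return cols, eval(source, {})


def aggregate(expr, table):
    """Sum `expr` over all rows, touching only the columns it references."""
    cols, f = compile_row_function(expr, table)
    selected = [table[c] for c in cols]
    n = len(next(iter(table.values())))  # all columns share one length
    return sum(f(*(col[i] for col in selected)) for i in range(n))


if __name__ == "__main__":
    table = {"a": [1, 2, 3], "b": [2, 4, 6], "c": [7, 8, 9]}
    print(referenced_columns("a / b", table))  # column "c" is not needed
    print(aggregate("a / b", table))
```

In the C# host, the same list of names could be used to build the "lambda a, b : ..." source string shown earlier, so that the delegate's parameter list grows with the expression rather than with the table.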

A ScriptScope can be created from an IDynamicMetaObjectProvider, which I expected C#'s dynamic to implement. At runtime, however, the call binds to engine.CreateScope(IDictionary), and that fails.
dynamic iterator = new RowIterator(a, b) as dynamic;
var scope = engine.CreateScope(iterator);
var expr = engine.CreateScriptSourceFromString("a / b").Compile();

double sum = 0;
for (int i = 0; i < n; i++)
{
    iterator.RowIndex = i; // the property is named RowIndex, not Index
    sum += expr.Execute<double>(scope);
}

Next I let RowIterator inherit from DynamicObject and got a running example, but with terrible performance: 158 s.
class DynamicRowIterator : DynamicObject
{
    Dictionary<string, object> members = new Dictionary<string, object>();
    IList<int> la;
    IList<int> lb;

    public DynamicRowIterator(IList<int> a, IList<int> b)
    {
        this.la = a;
        this.lb = b;
    }

    public int RowIndex { get; set; }
    public int a { get { return la[RowIndex]; } }
    public int b { get { return lb[RowIndex]; } }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        if (binder.Name == "a") // Why does this happen?
        {
            result = this.a;
            return true;
        }
        if (binder.Name == "b")
        {
            result = this.b;
            return true;
        }
        if (base.TryGetMember(binder, out result))
            return true;
        if (members.TryGetValue(binder.Name, out result))
            return true;
        return false;
    }

    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        if (base.TrySetMember(binder, value))
            return true;
        members[binder.Name] = value;
        return true;
    }
}

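I was surprised that TryGetMember is called with the names of the declared properties. From the documentation I would have expected TryGetMember to be called only for undefined properties.

To get sensible performance I probably need to implement IDynamicMetaObjectProvider for my RowIterator so that dynamic CallSites can be used, but I could not find a suitable example to start from. In my experiments I did not know how to handle __builtins__ in BindGetMember: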
class Iterator : IDynamicMetaObjectProvider
{
    IList<int> la;
    IList<int> lb;

    public Iterator(IList<int> a, IList<int> b)
    {
        this.la = a;
        this.lb = b;
    }

    public int RowIndex { get; set; }
    public int a { get { return la[RowIndex]; } }
    public int b { get { return lb[RowIndex]; } }

    public DynamicMetaObject GetMetaObject(Expression parameter)
    {
        return new MetaObject(parameter, this);
    }

    private class MetaObject : DynamicMetaObject
    {
        internal MetaObject(Expression parameter, Iterator self)
            : base(parameter, BindingRestrictions.Empty, self) { }

        public override DynamicMetaObject BindGetMember(GetMemberBinder binder)
        {
            switch (binder.Name)
            {
                case "a":
                case "b":
                    Type type = typeof(Iterator);
                    string methodName = binder.Name;
                    Expression[] parameters = new Expression[]
                    {
                        Expression.Constant(binder.Name)
                    };
                    return new DynamicMetaObject(
                        Expression.Call(
                            Expression.Convert(Expression, LimitType),
                            type.GetMethod(methodName),
                            parameters),
                        BindingRestrictions.GetTypeRestriction(Expression, LimitType));
                default:
                    return base.BindGetMember(binder);
            }
        }
    }
}

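I am sure the code above is suboptimal; at the very least it does not yet handle the IDictionary of columns. I would be grateful for any advice on how to improve the design and/or the performance.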

Best answer

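I also compared the performance of IronPython against a C# implementation. The expression is simple: it just adds the values of two arrays at a given index. Accessing the arrays directly provides the baseline and the theoretical optimum. Accessing the values through a symbol dictionary still has acceptable performance.

The third test creates a delegate from a naive (and deliberately unoptimized) expression tree, without anything fancy such as call-site caching, and it is still faster than IronPython.

Scripting the expression through IronPython takes the most time. My profiler shows that most of the time is spent in PythonOps.GetVariable, PythonDictionary.TryGetValue and PythonOps.TryGetBoundAttr. I think there is room for improvement.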

Timings:

  • Direct: 00:00:00.0052680
  • via Dictionary: 00:00:00.5577922
  • Compiled Delegate: 00:00:03.2733377
  • Scripted: 00:00:09.0485515

Here is the code:
// Namespaces used: System, System.Collections.Generic, System.Diagnostics,
// System.Linq, System.Linq.Expressions, IronPython.Hosting
public static void PythonBenchmark()
{
    var engine = Python.CreateEngine();

    int iterations = 1000;
    int count = 10000;

    int[] a = Enumerable.Range(0, count).ToArray();
    int[] b = Enumerable.Range(0, count).ToArray();

    Dictionary<string, object> symbols = new Dictionary<string, object> { { "a", a }, { "b", b } };

    Func<int, object> calculate = engine.Execute("lambda i: a[i] + b[i]", engine.CreateScope(symbols));

    var sw = Stopwatch.StartNew();

    int sum = 0;

    // 1. Direct array access: the baseline.
    for (int iteration = 0; iteration < iterations; iteration++)
    {
        for (int i = 0; i < count; i++)
        {
            sum += a[i] + b[i];
        }
    }

    Console.WriteLine("Direct: " + sw.Elapsed);

    // 2. Look both arrays up in the symbol dictionary on every access.
    sw.Restart();
    for (int iteration = 0; iteration < iterations; iteration++)
    {
        for (int i = 0; i < count; i++)
        {
            sum += ((int[])symbols["a"])[i] + ((int[])symbols["b"])[i];
        }
    }

    Console.WriteLine("via Dictionary: " + sw.Elapsed);

    // 3. Naive expression tree: dictionary lookup, cast and indexer call per access.
    var indexExpression = Expression.Parameter(typeof(int), "index");
    var indexerMethod = typeof(IList<int>).GetMethod("get_Item");
    var lookupMethod = typeof(IDictionary<string, object>).GetMethod("get_Item");
    Func<string, Expression> getSymbolExpression = symbol => Expression.Call(Expression.Constant(symbols), lookupMethod, Expression.Constant(symbol));
    var addExpression = Expression.Add(
        Expression.Call(Expression.Convert(getSymbolExpression("a"), typeof(IList<int>)), indexerMethod, indexExpression),
        Expression.Call(Expression.Convert(getSymbolExpression("b"), typeof(IList<int>)), indexerMethod, indexExpression));
    var compiledFunc = Expression.Lambda<Func<int, object>>(Expression.Convert(addExpression, typeof(object)), indexExpression).Compile();

    sw.Restart();
    for (int iteration = 0; iteration < iterations; iteration++)
    {
        for (int i = 0; i < count; i++)
        {
            sum += (int)compiledFunc(i);
        }
    }

    Console.WriteLine("Compiled Delegate: " + sw.Elapsed);

    // 4. The IronPython lambda, called once per row.
    sw.Restart();
    for (int iteration = 0; iteration < iterations; iteration++)
    {
        for (int i = 0; i < count; i++)
        {
            sum += (int)calculate(i);
        }
    }

    Console.WriteLine("Scripted: " + sw.Elapsed);
    Console.WriteLine(sum); // make sure the loops cannot be optimized away
}
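
The profile above points at variable lookups (PythonOps.GetVariable, PythonDictionary.TryGetValue), which fits a lambda that resolves a and b as scope-level names on every call. One thing that might be worth measuring is binding the symbols once as closure variables instead. The following is only a sketch in plain Python (CPython 3); `make_global_lookup` and `make_bound` are hypothetical helper names, and whether the second form is actually faster under IronPython is an untested assumption, not a result.

```python
def make_global_lookup(expr_source, symbols):
    """Like the benchmark's lambda: the symbols are looked up as globals per call."""
    return eval("lambda i: " + expr_source, dict(symbols))


def make_bound(expr_source, symbols):
    """Wrap the expression in an outer lambda so the symbols become closure variables."""
    names = sorted(symbols)
    factory_source = "lambda {}: (lambda i: {})".format(", ".join(names), expr_source)
    factory = eval(factory_source, {})
    return factory(*(symbols[name] for name in names))


if __name__ == "__main__":
    import timeit

    count = 10000
    symbols = {"a": list(range(count)), "b": list(range(count))}
    by_global = make_global_lookup("a[i] + b[i]", symbols)
    by_closure = make_bound("a[i] + b[i]", symbols)

    # Both variants compute the same values; only the name resolution differs.
    for name, f in (("global lookup", by_global), ("closure", by_closure)):
        seconds = timeit.timeit(lambda: sum(f(i) for i in range(count)), number=20)
        print(name, seconds)
```

The inner lambda's `__code__.co_freevars` lists a and b, while the first variant lists them in `__code__.co_names`; that is the difference being tested. Any timing comparison would have to be repeated on the IronPython host itself before drawing conclusions.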

Regarding the performance of mass-evaluating expressions in IronPython, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/5379040/
