c–双重瓶颈,如何改进？

发布时间：2020-12-16 06:46:11 所属栏目：百科来源：网络整理

导读：我需要提高以下代码的性能(Intel Ivy Bridge,x64)： unsigned int delta;unsigned int a[100];unsigned int b[100];...double sum = 0;for(int i = 0; i 100; i++) sum += (double)b[i]/a[i];bool possible = delta = sum; 瓶颈确实是双倍的,使执行时间增加

我需要提高以下代码的性能(Intel Ivy Bridge,x64)：

unsigned int delta;
unsigned int a[100];
unsigned int b[100];

...

double sum = 0;

for(int i = 0; i < 100; i++)
    sum += (double)b[i]/a[i];

bool possible = delta >= sum;

瓶颈确实是双倍的,使执行时间增加了3倍. a [index]将是0到500米之间的任何值. b [index]将从0到500.

问：在对这段代码的两次调用之间如何修改数组a和b？

在每次通话时,唯一的区别是[索引];其中0< = index< 100 b总是一样的.三角洲也不会改变. 由于结果与另一个数字进行比较并存储为布尔值,我绝对需要尽可能高的精度.这就是为什么我使用双倍而不是浮动的原因.如你所知,即使1 / 1b的差异也会返回??错误的值,因为结果是布尔值！

解决方法

以下代码比英特尔酷睿上的原始代码快16倍
i7-3770使用带有“-O3”的Apple LLVM 5.0并且通常更准确(因为
它更有可能增加相似数量的数量,从而避免失败
浮点加法中的位).

由于迭代之间只有一个a [i]发生变化,我们可以缓存所有的
商.我们还可以将添加内容组织到二叉树和缓存中
大部分金额.然后,当一个[i]改变时,我们只需要更新
沿二叉树的单个路径求和.

首先,我们定义一个数组来保存商和它们的总和,我们
初始化它：

// Define number of elements in base arrays.
#define N 100

// Define size needed by adding sizes of each level of tree.
#define P   (100+50+26+14+8+4+2+1)

// Define array.
double q[P];

// Initialize first level with quotients.
for (int i = 0; i < N; ++i)
    q[i] = (double) b[i] / a[i];

// For each other level,form sums from two elements of previous level.
for (int b0 = 0,t = N; 1 < t;)
{
    // t is the number of elements in the current level.
    // b0 is the base for the previous level.
    // b1 is the base for the current level.
    int b1 = b0 + t;

    // If t is odd,pad the level with a zero element.
    if (t & 1)
        q[b1++] = 0;

    // Calculate the size of the current level.
    t = (t+1)/2;

    // Calculate each element in the current level from the previous level.
    for (int i = 0; i < t; ++i)
        q[b1+i] = q[b0+2*i+0] + q[b0+2*i+1];

    // Set the base for the next level.
    b0 = b1;
}

每当元素a [i]发生变化时,我们就会更新它的存储商
更新树：

double C(unsigned int a[],unsigned int b[],double q[],int i)
{
    // Update the stored quotient.
    q[i] = (double) b[i] / a[i];

    // Update the tree,using code similar to above.
    for (int b0 = 0,t = N; 1 < t;)
    {
        int b1 = b0 + t;
        if (t & 1)
            b1++;
        t = (t+1)/2;

        // Calculate the index for the element to update in this level.
        i /= 2;

        // Update the sum that changes in this level.
        q[b1+i] = q[b0+2*i+0] + q[b0+2*i+1];

        b0 = b1;
    }

    // Return the root.
    return q[P-1];
}

（编辑：李大同）

【声明】本站内容均来自网络，其相关言论仅代表作者个人观点，不代表本站立场。若无意侵犯到您的权利，请及时与联系站长删除相关内容!