How do I find a (segmentation fault) bug in a multithreaded C++ (pthread) program on Linux?

Published: 2020-12-14 01:03:42 | Category: Linux | Source: collected from the web

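I am debugging a multithreaded C++ (pthread) program on Linux.

It works fine when the number of threads is small, e.g. 1, 2 or 3.

When the number of threads increases, I get SIGSEGV (segmentation fault, UNIX signal 11).

However, once I raise the thread count above 4, the error sometimes appears and sometimes does not.

I ran the program under valgrind and got: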

==29655== Process terminating with default action of signal 11 (SIGSEGV)
==29655==  Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
==29655==    at 0x3AEB69CA3E: std::string::assign(std::string const&) (in /usr/lib64/libstdc++.so.6.0.8)
==29655==    by 0x42A93C: bufferType::getSenderID(std::string&) const (boundedBuffer.hpp:29)

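It seems that my code tried to read memory that is not allocated.
However, I cannot find anything wrong in the function getSenderID(). It only returns a string data member of class bufferType, and that member has been initialized.

I used GDB and DDD (a GDB GUI) to look for the bug. They point to the same place, but because the error sometimes disappears, I cannot catch it with a breakpoint in GDB.

I also printed the values inside the function that valgrind points to, but that did not help: the threads print their results in different orders and the output is interleaved. The printout is different every time I run the code.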
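
One way to make such debug output readable is to build each line completely and then write it while holding a single lock, so that lines from different threads cannot be mixed together. The following is only a minimal sketch: `logMutex`, `formatLogLine` and `logLine` are hypothetical names that do not appear in the original program, and the thread tag is assumed to be some number the caller already has. Extra locking like this also changes the timing of the threads, so the crash may show up less often while it is enabled.

```cpp
#include <pthread.h>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the original code: one global mutex
// that serializes every debug line written by any thread.
static pthread_mutex_t logMutex = PTHREAD_MUTEX_INITIALIZER;

// Build the whole line in a private buffer first. No lock is needed here
// because the stream is local to the calling thread.
std::string formatLogLine(unsigned long threadTag,
                          const std::string& what,
                          const std::string& value)
{
    std::ostringstream line;
    line << "[thread " << threadTag << "] " << what << " = '" << value << "'\n";
    return line.str();
}

// Write the finished line in one step while holding the lock, so output
// from different threads cannot interleave inside a line.
void logLine(unsigned long threadTag,
             const std::string& what,
             const std::string& value)
{
    const std::string line = formatLogLine(threadTag, what, value);
    pthread_mutex_lock(&logMutex);
    std::cerr << line;
    pthread_mutex_unlock(&logMutex);
}
```

A call such as `logLine(3, "senderID", id)` then produces one complete line per call. Note that this only tidies the output; the value passed in still has to be read safely in the first place.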

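The bufferType objects are stored in a map, and the map may have multiple entries. Each entry can be written by one thread and read by another thread at the same time. I then protected the entries with pthread read/write locks (pthread_rwlock_t). Now there is no SIGSEGV, but the program stops at some point without making progress. I think this is a deadlock. But a map entry is written by only one thread at any given time, so why is there still a deadlock?

Would you please recommend some methods to capture the bug so that I can find it no matter how many threads I use to run the code.

Thanks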

The code in boundedBuffer.hpp is as follows:

class bufferType
{
private:
    string senderID;   // who write the buffer
    string recvID;     // who should read the buffer
    string arcID;      // which arc is updated
    double price;      // write node's price
    double arcValue;   // this arc flow value
    bool   updateFlag;
    double arcCost;
    int    arcFlowUpBound;

    //boost::mutex  senderIDMutex;
    //pthread_mutex_t  senderIDMutex;
    pthread_rwlock_t senderIDrwlock;
    pthread_rwlock_t setUpdateFlaglock;

public:
    //typedef boost::mutex::scoped_lock lock;  // synchronous read / write

    bufferType(){}

    void getPrice(double& myPrice) const { myPrice = price; }
    void getArcValue(double& myArcValue) const { myArcValue = arcValue; }
    void setPrice(double& myPrice) { price = myPrice; }
    void setArcValue(double& myValue) { arcValue = myValue; }

    void readBuffer(double& myPrice, double& myArcValue);
    void writeBuffer(double& myPrice, double& myArcValue);

    void getSenderID(string& myID)
    {
        //boost::mutex::scoped_lock lock(senderIDMutex);
        //pthread_rwlock_rdlock(&senderIDrwlock);
        cout << "senderID is " << senderID << endl;
        myID = senderID;
        //pthread_rwlock_unlock(&senderIDrwlock);
    }

    //void setSenderID(string& myID){ senderID = myID ;}
    void setSenderID(string& myID)
    {
        pthread_rwlock_wrlock(&senderIDrwlock);
        senderID = myID;
        pthread_rwlock_unlock(&senderIDrwlock);
    }

    void getRecvID(string& myID) const { myID = recvID; }
    void setRecvID(string& myID) { recvID = myID; }
    void getArcID(string& myID) const { myID = arcID; }
    void setArcID(string& myID) { arcID = myID; }

    void getUpdateFlag(bool& myFlag)
    {
        myFlag = updateFlag;
        if (updateFlag)
            updateFlag = false;
    }

    //void setUpdateFlag(bool myFlag){ updateFlag = myFlag ;}
    void setUpdateFlag(bool myFlag)
    {
        pthread_rwlock_wrlock(&setUpdateFlaglock);
        updateFlag = myFlag;
        pthread_rwlock_unlock(&setUpdateFlaglock);
    }

    void getArcCost(double& myc) const { myc = arcCost; }
    void setArcCost(double& myc) { arcCost = myc; }
    void setArcFlowUpBound(int& myu) { arcFlowUpBound = myu; }
    int  getArcFlowUpBound() { return arcFlowUpBound; }

    //double getLastPrice() const {return price; }
};

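As you can see from the code, I tried to use read/write locks to keep the invariants.
Each entry in the map holds a buffer like the one above. Now I am running into a deadlock.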
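
Two things are visible in the posted class: the constructor never calls pthread_rwlock_init, and some accessors (getSenderID, getUpdateFlag) touch their members without taking the lock that the matching setter takes. POSIX leaves the behavior of an uninitialized rwlock undefined, and the same holds for a copy of one. Whether either point explains the hang cannot be determined from the excerpt alone. The sketch below shows one possible shape for a single guarded field, assuming C++98 and plain pthreads; `ReadGuard`, `WriteGuard` and `SenderSlot` are hypothetical names, not part of the original code.

```cpp
#include <pthread.h>
#include <cassert>
#include <string>

// Hypothetical RAII guards: the lock is released when the guard goes out
// of scope, including on early return or exception.
class ReadGuard {
public:
    explicit ReadGuard(pthread_rwlock_t& l) : lock_(l) { pthread_rwlock_rdlock(&lock_); }
    ~ReadGuard() { pthread_rwlock_unlock(&lock_); }
private:
    pthread_rwlock_t& lock_;
    ReadGuard(const ReadGuard&);             // not copyable
    ReadGuard& operator=(const ReadGuard&);
};

class WriteGuard {
public:
    explicit WriteGuard(pthread_rwlock_t& l) : lock_(l) { pthread_rwlock_wrlock(&lock_); }
    ~WriteGuard() { pthread_rwlock_unlock(&lock_); }
private:
    pthread_rwlock_t& lock_;
    WriteGuard(const WriteGuard&);           // not copyable
    WriteGuard& operator=(const WriteGuard&);
};

// Hypothetical stand-in for the senderID part of bufferType.
class SenderSlot {
public:
    SenderSlot()
    {
        // The lock must be initialized before its first use.
        int rc = pthread_rwlock_init(&lock_, NULL);
        assert(rc == 0);
        (void)rc;
    }
    ~SenderSlot() { pthread_rwlock_destroy(&lock_); }

    void setSenderID(const std::string& id)
    {
        WriteGuard guard(lock_);
        senderID_ = id;
    }

    // The reader takes the same lock as the writer and returns a copy
    // made while the lock is held.
    std::string getSenderID() const
    {
        ReadGuard guard(lock_);
        return senderID_;
    }

private:
    mutable pthread_rwlock_t lock_;
    std::string senderID_;
    SenderSlot(const SenderSlot&);           // copying the lock is forbidden
    SenderSlot& operator=(const SenderSlot&);
};
```

Because the sketch forbids copying, such an object could not be stored by value in a C++98 std::map; storing pointers to heap-allocated objects would be one option. The map itself would also need its own protection if entries are inserted or erased while other threads look entries up.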

Solution

Access not within mapped region at address 0xFFFFFFFFFFFFFFF8

at 0x3AEB69CA3E: std::string::assign(std::string const&)

This usually means that you are assigning to a string* that was NULL and then got decremented. Example:

#include <string>

int main()
{
  std::string *s = NULL;

  --s;
  s->assign("abc");
}

g++ -g t.cc && valgrind -q ./a.out

...
==20980== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==20980==  Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
==20980==    at 0x4EDCBE6: std::string::assign(char const*,unsigned long)
==20980==    by 0x400659: main (/tmp/t.cc:8)

…

So show us the code in boundedBuffer.hpp (with line numbers), and think about how that code could end up with a string pointer that points at -8.

Would you please recommend some methods to capture the bug so that I can find it no matter how many threads I use to run the code.

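When thinking about multithreaded programs, you must think about invariants. You should put in assertions to confirm that your invariants actually hold. You should think about how they could be violated, and which violations would produce the post-mortem state you observed.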
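
As a rough illustration of that advice, here is a minimal sketch of a one-slot buffer that states its invariant in code and checks it on entry to and exit from every operation. Everything in it (`Mailbox`, `tryPut`, `tryTake`, `runDemo`) is hypothetical and much simpler than the bufferType in the question; it assumes a single pthread mutex instead of read/write locks. The invariant is that the number of values produced minus the number consumed is 1 when the slot is full and 0 when it is empty.

```cpp
#include <pthread.h>
#include <sched.h>
#include <cassert>
#include <cstddef>

// Hypothetical one-slot mailbox guarded by one mutex.
class Mailbox {
public:
    Mailbox() : full_(false), value_(0.0), produced_(0), consumed_(0)
    {
        int rc = pthread_mutex_init(&mutex_, NULL);
        assert(rc == 0);
        (void)rc;
    }
    ~Mailbox() { pthread_mutex_destroy(&mutex_); }

    // Stores v and returns true, or returns false if the slot is still full.
    bool tryPut(double v)
    {
        pthread_mutex_lock(&mutex_);
        checkInvariant();
        const bool stored = !full_;
        if (stored) { value_ = v; full_ = true; ++produced_; }
        checkInvariant();
        pthread_mutex_unlock(&mutex_);
        return stored;
    }

    // Fetches the value and returns true, or returns false if the slot is empty.
    bool tryTake(double& out)
    {
        pthread_mutex_lock(&mutex_);
        checkInvariant();
        const bool taken = full_;
        if (taken) { out = value_; full_ = false; ++consumed_; }
        checkInvariant();
        pthread_mutex_unlock(&mutex_);
        return taken;
    }

    std::size_t consumed()
    {
        pthread_mutex_lock(&mutex_);
        const std::size_t n = consumed_;
        pthread_mutex_unlock(&mutex_);
        return n;
    }

private:
    // Called only while mutex_ is held.
    void checkInvariant() const
    {
        assert(produced_ - consumed_ == (full_ ? 1u : 0u));
    }

    pthread_mutex_t mutex_;
    bool full_;
    double value_;
    std::size_t produced_;
    std::size_t consumed_;
    Mailbox(const Mailbox&);                 // not copyable
    Mailbox& operator=(const Mailbox&);
};

struct DemoArgs { Mailbox* box; int rounds; };

void* producerMain(void* p)
{
    DemoArgs* a = static_cast<DemoArgs*>(p);
    for (int i = 0; i < a->rounds; ) {
        if (a->box->tryPut(i)) ++i; else sched_yield();
    }
    return NULL;
}

void* consumerMain(void* p)
{
    DemoArgs* a = static_cast<DemoArgs*>(p);
    double v = 0.0;
    for (int i = 0; i < a->rounds; ) {
        // Values must arrive in the order they were sent.
        if (a->box->tryTake(v)) { assert(v == i); ++i; } else sched_yield();
    }
    return NULL;
}

// Runs one producer and one consumer; returns how many values were consumed.
std::size_t runDemo(int rounds)
{
    Mailbox box;
    DemoArgs args = { &box, rounds };
    pthread_t producer, consumer;
    pthread_create(&producer, NULL, producerMain, &args);
    pthread_create(&consumer, NULL, consumerMain, &args);
    pthread_join(producer, NULL);
    pthread_join(consumer, NULL);
    return box.consumed();
}
```

If an assertion like this fires, the program aborts at the point where the invariant was first seen broken, which is usually much closer to the cause than a later SIGSEGV. The checks disappear when the code is compiled with -DNDEBUG, so they need to stay enabled in the build used for debugging.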

(Editor: Li Datong)