Assignment 1 – XML Data

Assessment Summary
Submission

Zip files larger than 10 MB will not be accepted. A zip larger than 10 MB means that it includes either your program output temp.xml or the input files all_pol.csv and all_tweets.csv. None of these files should be in the submission.

You complete the assignment in a group of up to four students whom you choose yourself. If your group has fewer than four members, or you work on your own, you still complete all tasks. External students may choose to collaborate on Skype (or another social media forum). Skype has a whiteboard and supports desktop sharing, which make discussions easy. Submission by one group member is sufficient.

Aims of Assignment

To start

In the DBE-A1 folder, you will see some empty files, as illustrated by the screenshot below. You will gradually put your answers in these files.

Modify the people.txt file to record your group members' email IDs and their main tasks. Each member must lead the work on at least one task (their main task) and contribute most of its solution. Each task has exactly one leading member. Members are encouraged to contribute to tasks led by other students so that the final solutions are complete, checked and discussed.

The file people.txt is important, especially when a dispute arises among group members. Generally, the marker will give all group members the same mark. However, when the work quality is significantly inconsistent across the tasks, or in the case of a dispute, group members may be given marks based on how well individual tasks are done.

Application and requirements

In this assignment, we conduct data transformation on a small sample (data set) of Twitter data. The data set is about US politicians and their previously published tweets, and it was downloaded from http://www.cs.washington.edu/research/xmldatasets/www/repository.html. The data set contains two csv files, all_pol.csv and all_tweets.csv.
The first one contains the information about politicians, and the second one contains the tweets they posted. These two files are in the DBE-work1 folder downloaded together with the assignment specification. An introduction to csv files is given in the extraRecordings tab on the course website. By default, csv files use a comma ‘,’ as the delimiter between values. However, these two csv files use a semicolon ‘;’ as the delimiter. When you open them in Excel, you will need to use the menu function “data>text to columns” to divide the data into columns.

The assignment contains the following tasks, and each task is worth 3 marks. The first task is the basis for the other tasks and needs to be completed first.

Task 1: Choose data from all_tweets.csv and all_pol.csv and convert the chosen data to XML.

The detailed requirements of the tasks are given below.

Task 1: Choose data from all_tweets.csv and all_pol.csv and convert the chosen data to XML

A group with two people or fewer chooses 5 tweets and 3 politicians. You convert the chosen tweets into a well-formed XML document manually and save it in tweets.xml. You then convert the chosen politicians into another well-formed XML document manually and save it in pol.xml.

Task 2: Design XML DTD for the data merged from XML documents

Task 3: Write an XQuery query to join the XML documents

Copy and insert the supplied header in header.txt at the front of the file merge.xml. Use Editix or another tool to validate merge.xml (against merge.dtd). If there is any error, modify the query and try again.
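When converting rows by hand, a common way to lose well-formedness is to copy a tweet containing ‘&’ or ‘<’ straight into an element. The following is a minimal Python sketch of that point, not part of the supplied material: it reads one ‘;’-delimited row with the standard csv module and checks two hand-built documents with xml.etree.ElementTree. The column names (id, text), the element names (tweet, text) and the sample row are hypothetical; the real files may use different ones. Note that this only checks well-formedness; validation against a DTD still needs Editix or a similar tool.

```python
import csv
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical two-column sample in the same ';'-delimited style as the
# assignment files; the real column names may differ.
sample = "id;text\n1;Jobs & growth <now>\n"

rows = list(csv.reader(io.StringIO(sample), delimiter=";"))
header, first = rows[0], rows[1]

# Copying the raw value into an element breaks well-formedness ...
raw = "<tweet><text>" + first[1] + "</text></tweet>"
# ... while escaping '&', '<' and '>' keeps the document well-formed.
safe = "<tweet><text>" + escape(first[1]) + "</text></tweet>"

print(is_well_formed(raw))   # False
print(is_well_formed(safe))  # True
```

The same check can be pointed at the text of tweets.xml or pol.xml before moving on to the DTD and the XQuery join.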
TableName1 (attribute1, attribute2, attribute3, …)

You then compare the XML model and the RDB schema and comment on the advantages and disadvantages of the XML and relational models. In your comments, you need to use relevant parts of your chosen data as examples to justify your points. That is, each point must be followed by an example using your own data. Justification by your own examples is important for the marker to see your understanding. If the writing is not cohesive and the points are not backed up with examples, very few marks will be given. Write your answer for this section in the file ‘sqlDBcomm.docx’. Poor presentation and bad writing will have marks deducted.

Task 5: Automate Step 1.

The input filename and its path “..all_tweets.csv” must be read as a command line argument. The output (a well-formed XML document) must be displayed on the screen. Your program code must not use any hard-coded path or filename. At the beginning, test your program on the tweets.csv file you created. That is, use tweets.csv as the command line argument instead of ..all_tweets.csv. For the large input file ..all_tweets.csv, use the following syntax to direct the screen output to the file temp.xml, where “>” is the output redirection operator:

    java ToXML ..all_tweets.csv > temp.xml
If you use Java, any special libraries used by your program should be included as jar files in the DBE-A1 folder together with your own code. Sample code has been placed in ToXML.java and ToXML.py for demonstration. You can remove the code if you do not need it.
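As an illustration of the Task 5 requirements (path taken from the command line, XML written to the screen, no hard-coded filename), here is a minimal Python sketch. It is not the supplied ToXML.py sample, and it rests on several assumptions that should be checked against the real data: the first row of the csv file is a header, the header names are valid XML element names, the file is UTF-8 encoded, and the root and record element names (tweets, tweet) are placeholders.

```python
import csv
import sys
from xml.sax.saxutils import escape


def csv_to_xml(lines, out, root="tweets", record="tweet"):
    """Write one <record> element per csv row to out.

    Assumes the first row is a header whose names are valid XML
    element names; 'tweets' and 'tweet' are placeholder names.
    """
    reader = csv.reader(lines, delimiter=";")  # files use ';' not ','
    header = next(reader)
    out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    out.write("<%s>\n" % root)
    for row in reader:
        out.write("  <%s>\n" % record)
        for name, value in zip(header, row):
            # escape() replaces '&', '<' and '>' so the output stays well-formed
            out.write("    <%s>%s</%s>\n" % (name, escape(value), name))
        out.write("  </%s>\n" % record)
    out.write("</%s>\n" % root)


def main(argv):
    # The input path comes from the command line, never from the code.
    if len(argv) != 2:
        sys.stderr.write("usage: python ToXML.py <input.csv>\n")
        return
    with open(argv[1], newline="", encoding="utf-8") as f:
        csv_to_xml(f, sys.stdout)


if __name__ == "__main__":
    main(sys.argv)
```

Run on the small file first, for example `python ToXML.py tweets.csv`, and redirect with `> temp.xml` only for the large input. Because the rows are written one at a time, the whole input never has to be held in memory.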
Plagiarism

Extensions

Late Penalties