Review Doc

Automated Scoring System for Essays

Members: P. Aruna, R. Dhivya Priya, R. Divya Harshini
Project Guide: Dr K S Eashwara Kumar

SUMMARY:

The objective of an automated essay scoring system is to assign scores to essays written in an educational setting. It is a method of educational assessment and an application of natural language processing.

This system uses a word-based document vector construction method because it is versatile. Likewise, we adopt Content Vector Analysis (CVA) in preference to Latent Semantic Analysis (LSA). This is because in LSA the SVD step involves a higher-order algorithmic complexity of O(n^2k^3), and words are required to exhibit a normal distribution for good performance. CVA can be used in this case because the distribution of words in corporate datasets can be expected to be random.

Our system uses a model-based approach because a memory-based approach requires a large training dataset. This approach is also popular in text classification problems, where very high-dimensional spaces are the norm. Compared with the memory-based approach, which has heavy space, storage, and training requirements, our model-based approach of calculating the deviation of the examined essay from the ideal scored essays is preferred.

The system first evaluates text complexity features, such as the number of characters in the document (Chars), the number of words in the document (Words), the number of different words (Diffwds), the fourth root of the number of words in the document, as suggested by Page (Rootwds), the number of sentences in the document (Sents), the average word length (Wordlen = Chars/Words), the average sentence length (Sentlen = Words/Sents), and the number of words longer than five characters (BW5). Each feature has its own use. For example, the number of words represents the length of the essay, since the length requirement is, say, 250-300 words.
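The text complexity features listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's own code: the function names, the regular expressions used to split words and sentences, the choice to count whitespace in Chars, and the min_words threshold are all hypothetical.

```python
import re


def surface_features(text: str) -> dict:
    """Compute the text complexity features named in the summary."""
    # Assumption: a word is a run of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Assumption: a sentence ends at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    n_chars = len(text)  # assumption: Chars counts every character, spaces included
    n_words = len(words)
    n_sents = len(sentences)

    return {
        "Chars": n_chars,
        "Words": n_words,
        "Diffwds": len({w.lower() for w in words}),  # distinct words, case-insensitive
        "Rootwds": n_words ** 0.25,                  # fourth root of the word count
        "Sents": n_sents,
        "Wordlen": n_chars / n_words if n_words else 0.0,
        "Sentlen": n_words / n_sents if n_sents else 0.0,
        "BW5": sum(1 for w in words if len(w) > 5),  # words longer than five characters
    }


def is_too_short(features: dict, min_words: int) -> bool:
    """Flag an empty or very short essay (min_words is a hypothetical threshold)."""
    return features["Words"] < min_words
```

A call such as is_too_short(surface_features(essay), min_words=20) could then be used to reject an essay before any further processing. The value 20 is only a placeholder, since the text does not state the actual cut-off.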
The word-count feature can detect an empty essay, or one so short that it cannot be processed, and reject it immediately. Otherwise, a score is assigned accordingly.

Once the essay passes the feature extraction process, the next step is to check it for spelling mistakes. The number of spelling mistakes is recorded and the errors are autocorrected. The essay is then checked for grammatical mistakes, and a score component is assigned based on the result.

Next, the stop words are removed and the essay is stemmed. The number of times a word occurs in a document (tf) and the number of documents containing the word (df) are calculated. The inverse document frequency (idf) is calculated from df, and the tf-idf weight is computed. Content Vector Analysis is then carried out, and a score is assigned to the test essay based on its deviation from the reference essay.

The individual raw scores from the feature extraction process, the grammar/spell check process, and the content vector analysis process are taken, and a weight is assigned to each component. The scores are then subjected to regression techniques, from which the final score is calculated. In this manner, we can ensure that the essays are graded uniformly, with equity and less fatigue.

STATUS:

Modules:

GUI: The user keys in the test essay to be evaluated. On clicking the ‘SUBMIT’ button, the essay is recorded in a file.

Special Case: If the user clicks the ‘SUBMIT’ button without typing the essay, a prompt appears asking the user to key in the essay and then press the submit button. This prompt appears the first time only. The next time the user presses the submit button without an essay, it is counted as no answer and the score is 0.

Surface Feature Extraction: The text complexity features are extracted and compared with the specified requirements. Each feature has its own use.
For example, the number of words represents the length of the essay, since the length requirement is, say, 250-300 words. This feature can detect an empty essay, or one so short that it cannot be processed, and reject it immediately. Otherwise, a score is assigned accordingly.

Stanford toolkit: POS tagger: The Part-Of-Speech Tagger is used to tag each sentence, and the tagged sentence is used to find the number of verbs in the essay.

Spell Checking: Hey, shall we say that we wrote the code or that we took the snippet?

Status labels: Completed before Review-I; Completed after Review-I; Completed for Review II (2nd online submission); Completed for Review II.

Grammar Checking: JLinkGrammar is used to count the grammar mistakes. A shell script that runs JLinkGrammar was written and is called from NetBeans.

Content Vector Analysis: The autocorrected test essay is subjected to stop word removal and then stemming. All the keywords from the corpus are extracted and a term-document matrix is constructed. In the matrix, the first document represents the test essay, so its correlation with the other documents is calculated. The score is assigned depending on the deviation.

Regression: The individual raw scores from the feature extraction process, the grammar/spell check process, and the content vector analysis process are taken, and a weight is assigned to each component.

Component 1: Text Complexity Feature Score, 3%
Component 2: Spell Check Score, 7%
Component 3: Grammar Check Score, 10%
Component 4: Relation to the Topic Score, 80%

The final score is assigned based on the weighted sum of all the score components.
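The Content Vector Analysis step and the weighted sum described above can be illustrated with a short Python sketch. It is a sketch under assumptions, not the project's implementation: cosine similarity stands in for the "correlation" between the test essay and the other documents, the idf formula is a smoothed variant chosen here because the text does not give one, the best match over the reference essays is taken as the content score, and every component score is assumed to lie between 0 and 1. All function and key names are hypothetical. Only the four weights come from the text.

```python
import math
from collections import Counter

# Weights taken from the component list in the text (3%, 7%, 10%, 80%).
WEIGHTS = {"features": 0.03, "spelling": 0.07, "grammar": 0.10, "content": 0.80}


def tfidf_vectors(docs):
    """Build one sparse tf-idf vector per document.

    docs is a list of token lists, assumed already stop-word-filtered and stemmed.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumption: smoothed idf, log((1 + N) / (1 + df)) + 1, so that a term
    # appearing in every document still carries a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def content_score(test_tokens, reference_docs):
    """Score the test essay against the reference essays (needs at least one)."""
    # As in the text, the test essay is the first document in the matrix.
    vectors = tfidf_vectors([test_tokens] + reference_docs)
    test_vec, ref_vecs = vectors[0], vectors[1:]
    # Assumption: the closest reference essay determines the content score.
    return max(cosine(test_vec, ref) for ref in ref_vecs)


def final_score(components):
    """Weighted sum of the four component scores, each assumed to be in [0, 1]."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

With these weights, an essay that scores 0.5 on text complexity, 1.0 on spelling, 0.0 on grammar, and 0.5 on relation to the topic would receive 0.03(0.5) + 0.07(1.0) + 0.10(0.0) + 0.80(0.5) = 0.485, which shows how strongly the content component dominates the result.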
Experiment:
Input: Test Essay and Corpus
Output: Score

Contribution of the Candidate:

Change it if you want. See to it that all get 100% and that we don’t put ourselves into trouble by writing a module-wise split-up ;)

Implementation: Aruna P - 20, Dhivya Priya R - 40, Divya Harshini R - 40
Documentation: Aruna P - 50, Dhivya Priya R - 25, Divya Harshini R - 25
Background Work: Aruna P - 30, Dhivya Priya R - 35, Divya Harshini R - 35