Quality Inspection Platform for Civil Judgement Documents
Drawing on data quality and information quality assessment, this product divides data into structured and unstructured types according to format. Based on the actual needs of the scenario, assessment indicators are designed along subjective and objective dimensions. Structured information consists mainly of tag data that can be extracted by keywords. Its quality is measured with commonly used measurement methods, combined with the measurement dimensions of objective information theory. The measurement covers seven categories: delicacy, delay, authenticity, integrity, consistency, readability and accuracy, and 15 detailed evaluation indexes are designed according to the requirements of civil judgment documents. For unstructured text information, we pay more attention to the quality of the semantic and pragmatic information it contains. Using supervised machine learning and deep learning, we take the large volume of data accumulated in the system as positive training samples with high information quality to simulate the manual information quality evaluation process.
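As a rough illustration of how two of the structured-data categories above (integrity and consistency) might be checked, here is a minimal sketch. The field names, the required-field list and the date rule are all assumptions made for this example; they are not the 15 indexes used by the platform.

```python
# Hypothetical required fields for a civil judgment record (illustrative only).
REQUIRED_FIELDS = ("case_number", "court", "judgment_date", "plaintiff", "defendant")


def integrity_score(record: dict) -> float:
    """Return the fraction of required fields that are present and non-empty."""
    filled = 0
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is not None and str(value).strip():
            filled += 1
    return filled / len(REQUIRED_FIELDS)


def consistency_ok(record: dict) -> bool:
    """Assumed rule: the filing date must not be later than the judgment date.

    Dates are expected as ISO strings (YYYY-MM-DD), which compare correctly
    as plain strings.
    """
    filing = record.get("filing_date")
    judgment = record.get("judgment_date")
    if not filing or not judgment:
        return True  # nothing to compare, so no inconsistency is reported
    return filing <= judgment


record = {
    "case_number": "EXAMPLE-001",
    "court": "Example District Court",
    "judgment_date": "2020-05-01",
    "filing_date": "2020-01-15",
    "plaintiff": "A",
    "defendant": "",  # missing value lowers the integrity score
}
print(integrity_score(record), consistency_ok(record))
```

Each category yields a score or a pass/fail flag per document, so results could be aggregated into a per-document report.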
Judicial Text Data Automatic Generation System
We design and implement an automated system for generating judicial text data, which includes a training data generation module and a testing data generation module. The training data generation module provides data augmentation services for judicial deep learning models, increasing the amount of high-quality judicial training text and improving the prediction accuracy of the model. We design two generation methods, one rule-based and one based on a Variational Auto-encoder. For rule-based generation, we propose an augmentation method tailored to the characteristics of judicial text. The Variational Auto-encoder method applies the Variational Auto-encoder to text generation: it learns low-dimensional features of judicial text, adds noise, and reconstructs new text with a similar distribution. Extensive experiments show that both training data generation methods provided by the system are effective, and that they can increase the accuracy of the TextCNN crime prediction model from 81.91% to 83.31%.
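The rule-based path described above can be pictured with a small sketch. It uses synonym replacement, one common text augmentation rule, as a stand-in; the actual rules used by the system are not given here, and the synonym table and the `augment` function are hypothetical.

```python
import random

# Hypothetical synonym table; a real system would use a domain-specific lexicon.
SYNONYMS = {
    "defendant": ["accused"],
    "stole": ["took"],
    "property": ["belongings", "assets"],
}


def augment(tokens, rng, p_replace=0.5):
    """Return a new token list in which replaceable tokens are swapped
    for a synonym with probability p_replace."""
    out = []
    for tok in tokens:
        options = SYNONYMS.get(tok)
        if options and rng.random() < p_replace:
            out.append(rng.choice(options))
        else:
            out.append(tok)
    return out


rng = random.Random(0)  # fixed seed so runs are repeatable
sentence = "the defendant stole property from the victim".split()
print(" ".join(augment(sentence, rng)))
```

Because the label of the original sample is reused for the augmented one, rules of this kind are only safe when they preserve the meaning that the label depends on.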
Model Evaluation System Driven by Judicial Data Quality
After a comprehensive analysis of the background, current situation and system requirements, we design and implement a model evaluation system driven by judicial data quality. It is divided into four main modules: data interaction, document analysis, quality detection and model evaluation. With this system, users can select or upload machine learning models, select built-in documents or upload a data set of judgment documents, carry out judicial intelligence classification tasks such as law article prediction, and calculate model evaluation indexes. KNN, SVM and Naive Bayes are the basic machine learning models built into the system, and the model evaluation indexes used are Accuracy, F1 score, KS value and PSI. For built-in or uploaded judgment documents, the system automatically performs field parsing, label classification and feature extraction, and generates quality inspection reports. The quality attributes include interpretability, relevancy, accuracy and consistency.
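For readers unfamiliar with the four evaluation indexes named above, the following sketch computes them with the standard textbook definitions for a binary task. It assumes labels are 0 or 1 with both classes present and that PSI receives pre-binned proportions; the system's own implementation may differ.

```python
import math


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ks_value(y_true, scores):
    """Largest gap between cumulative true-positive and false-positive rates
    as the score threshold is lowered. Assumes 0/1 labels, both present."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), reverse=True)
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process tied scores together so they share one threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best


def psi(expected, actual, eps=1e-6):
    """Population Stability Index over matching bins of proportions."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        total += (a - e) * math.log(a / e)
    return total


y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

Accuracy and F1 score judge the predicted labels, KS value judges how well the scores separate the two classes, and PSI judges how far a score distribution has drifted between two data sets.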
System for Testing Judicial Case Screening System
The main work of this product is to define a set of multi-dimensional test criteria for case screening and to build a full-featured testing platform for case screening systems. The platform tests a case screening system at two levels: the model level and the system level. At the model level, the machine learning models behind the case screening system are evaluated with basic and extended metrics on a specific data set. At the system level, the interfaces of the case screening system are tested with multi-dimensional similarity metrics. The testing system is delivered as a web application. Users access it through a browser to manage data sets and model files and to run testing tasks that automatically generate the corresponding results. The system is intended to be practical for developers and testers in the legal field.
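The similarity metrics used at the system level are not specified here. As an assumption-laden sketch, two common choices, Jaccard similarity and cosine similarity over token counts, could be used to compare a case returned by the screening interface with a reference case. The function names and sample tokens are hypothetical.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Overlap of the two token sets divided by their union."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty documents are treated as identical
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine of the angle between the two token-count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


returned = ["loan", "contract", "dispute"]
reference = ["loan", "contract", "breach"]
print(jaccard(returned, reference), cosine(returned, reference))
```

Jaccard ignores how often a term occurs, while cosine takes term frequency into account, so the two can disagree on long documents with repeated terms.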
- National natural science foundation of China (Key Program): Software testing technology for
security-critical deep learning system (61832009), 2019-2023
- National natural science foundation of China (General Program): Research on detection and repair
technology of software numerical stability (61772260), 2018-2021
- National natural science foundation of China (Major Program): Convergence feedback mechanism and
support platform for massive information in software development (61690201), 2017-2021
- National key R&D program of China: Research and development of people's court business and
data standards, construction of basic judicial service database and case screening and
evaluation model (2016YFC0800805), 2016-2020
- National program on key basic research project (973 Program): Research on the construction and
quality assurance of security critical software systems (2014CB340700), 2014-2018
- National natural science foundation of China (Projects of international cooperation and
exchanges): Stability analysis of numerical programs based on transmutation relation metamorphic
relations (61311130424), 2013
- Supported by projects of international cooperation and exchanges NSFC: Research on software
engineering recommendation technology based on developer social network (61211120438), 2012
- National natural science foundation of China: Stability analysis for numerical programs
- National natural science foundation of China: Reliability testing for numerical programs
- National natural science foundation of China: Software testing techniques in scientific
- Zixi Liu, Yang Feng, Yining Yin, Zhenyu Chen. DeepState: Selecting Test Suites to Enhance the Robustness of Recurrent Neural Networks. ICSE 2022.
- Xinyu Gao, Yang Feng, Yining Yin, Zixi Liu, Zhenyu Chen, Baowen Xu. Adaptive Test Selection for Deep Neural Networks. ICSE 2022.
- Zhang X, Liu J, Sun N, et al. Duo: Differential Fuzzing for Deep Learning Operators [J]. IEEE Transactions on Reliability, 2021, 70(4): 1671-1685.
- Pin Ji, Yang Feng, Jia Liu, Zhihong Zhao, Baowen Xu. Automated Testing for Machine Translation via Constituency Invariance. ASE 2021.
- Ni Y, Xia Z, Zhao F, et al. An Online Multi-step-forward Voltage Prediction Approach based on LSTM-TD Model and KF Algorithm [J]. Computer, 2021, 54(8): 56-65.
- Hou Y, Liu J, Wang D, et al. TauMed: Test Augmentation of Deep Learning in Medical Diagnosis [C]//Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2021: 674-677.
- Zhang X, Sun N, Fang C, et al. Predoo: Precision Testing of Deep Learning Operators [C]//Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2021: 400-412.
- Yu S, Fang C, Yun Y, et al. Layout and Image Recognition Driving Cross-platform Automated Mobile Testing [C]//2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 2021: 1561-1571.
- Zhai J, Shi Y, Pan M, et al. C2S: translating natural language comments to formal program specifications [C]//Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2020: 25-37.
- Gong A, Zhong Y, Zou W, et al. Incorporating Android Code Smells into Java Static Code Metrics for Security Risk Prediction of Android Applications [C]//2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS). IEEE, 2020: 30-40.
- Guo Z, Liu J, He T, et al. TauJud: Test Augmentation of Machine Learning in Judicial Documents [C]//Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2020: 549-552.
- Fang C, Liu Z, Shi Y, et al. Functional Code Clone Detection with Syntax and Semantics Fusion Learning [C]//Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2020: 516-527.
- Feng Y, Shi Q, Gao X, et al. DeepGini: prioritizing massive tests to enhance the robustness of deep neural networks [C]//Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2020: 177-188.
- Zou W, Lo D, Kochhar P S, et al. Smart contract development: challenges and opportunities [J]. IEEE Transactions on Software Engineering, 2019.
- He T, Yu S, Wang Z, et al. From data quality to model quality: an exploratory study on Deep Learning [C]//Proceedings of the 11th Asia-Pacific Symposium on Internetware. 2019: 1-6.
- Lei C, Hu B, Wang D, et al. A preliminary study on data augmentation of Deep Learning for image classification [C]//Proceedings of the 11th Asia-Pacific Symposium on Internetware. 2019: 1-6.
- Cao J, Wang X, Li Z, et al. The evolution of open-source blockchain systems: an empirical study [C]//Proceedings of the 11th Asia-Pacific Symposium on Internetware. 2019: 1-10.
- Wang D, Wang Z, Fang C, et al. DeepPath: Path-driven testing criteria for Deep Neural Networks [C]//2019 IEEE International Conference On Artificial Intelligence Testing (AITest). IEEE, 2019: 119-120.
- Wu H, Wang X, Xu J, et al. Mutation testing for ethereum smart contract [J]. arXiv preprint arXiv:1908.03707, 2019.
- Zhang X, Yin Z, Feng Y, et al. NeuralVis: visualizing and interpreting deep learning models [C]//2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2019: 1106-1109.
- Shi Q, Wan J, Feng Y, et al. DeepGini: prioritizing massive tests to reduce labeling cost [J]. arXiv, 2019: arXiv: 1903.00661.
- Gu Q, Cai W, Yu S, et al. Judicial image quality assessment based on deep learning: an exploratory study [C]//2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). IEEE, 2019.
- Zou W, Zhang W, Xia X, et al. Branch use in practice: A large-scale empirical study of 2,923 projects on github [C]//2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). IEEE, 2019: 306-317.
- Zou W, Xuan J, Xie X, et al. How does code style inconsistency affect pull request integration? An exploratory study on 117 GitHub projects [J]. Empirical Software Engineering, 2019, 24(6): 3871-3903.
- Wang X, Wang H, Su Z, et al. Global optimization of numerical programs via prioritized stochastic algebraic transformations [C]//2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 2019: 1131-1141.
- Shen W, Wan J, Chen Z. MuNN: Mutation analysis of neural networks [C]//2018 IEEE International Conference on Software Quality, Reliab
- Feng Y, Jones J, Chen Z, et al. An empirical study on software failure classification with multi-label and problem-transformation techniques [C]//2018 IEEE 11th International Conference on Software Testing, Verification and Validation (ICST). IEEE, 2018: 320-330.
- He T K, Lian H, Qin Z M, et al. PTM: A topic model for the inferring of the penalty [J]. Journal of Computer Science and Technology, 2018, 33(4): 756-767.
Before 2017
- Tang E, Zhang X, Müller N T, et al. Software numerical Instability detection and diagnosis by combining stochastic and infinite-precision testing [J]. IEEE Transactions on Software Engineering, 2016, 43(10): 975-994.
- He T, Yin H, Chen Z, et al. A spatial-temporal topic model for the semantic annotation of POIs in LBSNs [J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2016, 8(1): 1-24.
- Wang Y, Gao R, Chen Z, et al. WAS: a weighted attribute-based strategy for cluster test selection [J]. Journal of Systems and Software, 2014, 98: 44-58.
- Wei S, Tang E, Liu T, et al. Automatic Numerical Analysis Based on Infinite-precision Arithmetic [C]//2014 Eighth International Conference on Software Security and Reliability (SERE). IEEE, 2014: 216-224.
- Xia X, Feng Y, Lo D, et al. Towards more accurate multi-label software behavior learning [C]//2014 Software Evolution Week-IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE). IEEE, 2014: 134-143.
- Liu J, Wang W, Chen Z, et al. A novel user-based collaborative filtering method by inferring tag ratings [J]. ACM SIGAPP Applied Computing Review, 2012, 12(4): 48-57.
- Feng Y, Chen Z. Multi-label software behavior learning [C]//2012 34th International Conference on Software Engineering (ICSE). IEEE, 2012: 1305-1308.
- Chen Z, Chen T Y, Xu B. A revisit of fault class hierarchies in general Boolean specifications [J]. ACM Transactions on Software Engineering and Methodology (TOSEM), 2011, 20(3): 1-11.
- Chen Z Y, Xu B W, Ding D C. The complexity of variable minimal formulas [J]. Chinese Science Bulletin, 2010, 55(18): 1957-1960.
- Ji C, Chen Z, Xu B, et al. A Novel Method of Mutation Clustering Based on Domain Analysis [C]//Seke. 2009, 9: 422-425.