Abstract
By implementing data-driven models for the 2011 Great East Japan earthquake and tsunami, the present study investigates how the level of spatial aggregation of the data affects the models' predictive ability, and whether region-dependent patterns influence model accuracy and feature importance. An extended version of the dataset compiled by the Japanese Ministry of Land, Infrastructure, Transport and Tourism (MLIT) after the 2011 event in the Tohoku region was used to generate sub-datasets at different spatial scales, ranging from individual cities of different sizes to clusters at regional and multi-regional levels. The results indicate a high variance in accuracy across the models trained on the different subsets, with relative hit rates ranging from 0.68 to 0.89 and correlating positively with the cardinality of the sets, as well as some regional patterns in the prediction errors. The cluster-averaged feature importance is stable across all selections and reflects the results obtained from the models trained on the whole dataset, thus allowing a more informed identification of the most significant influencing factors for tsunami damage modelling.
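The two quantities compared across subsets can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the relative hit rate, so it is assumed here to be the fraction of buildings whose predicted damage class matches the observed one, and the cluster average is assumed to be an unweighted mean over subsets. The function names, subset names, feature names, and all numbers are hypothetical.

```python
from statistics import mean


def hit_rate(y_true, y_pred):
    """Fraction of samples whose predicted class equals the observed class.

    Assumed definition; the abstract does not state how the relative
    hit rate is computed.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_importance(importances_by_subset):
    """Unweighted mean of each feature's importance over all subsets.

    A feature missing from a subset is counted as 0.0 for that subset.
    """
    subsets = list(importances_by_subset.values())
    features = sorted({name for imp in subsets for name in imp})
    return {
        name: mean(imp.get(name, 0.0) for imp in subsets)
        for name in features
    }


# Toy values for illustration only (not results from the study).
observed = [1, 2, 3, 3]
predicted = [1, 2, 3, 1]
print(hit_rate(observed, predicted))

toy_importances = {
    "city_a": {"inundation_depth": 0.6, "structure_type": 0.4},
    "city_b": {"inundation_depth": 0.8, "structure_type": 0.2},
}
print(average_importance(toy_importances))
```

Weighting each subset by its number of samples would be an equally plausible choice for the average; the unweighted form is used here only because it is the simplest.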
Original language | English |
---|---|
Journal | COMPDYN Proceedings |
Publication status | Published - 2023 |
Event | 9th ECCOMAS Thematic Conference on Computational Methods in Structural Dynamics and Earthquake Engineering, COMPDYN 2023 - Athens, Greece. Duration: 2023 Jun 12 → 2023 Jun 14 |
Keywords
- Building vulnerability
- Feature importance
- Machine learning
- Tohoku tsunami
- Tsunami damage