This HTML5 document contains 36 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Prefix    Namespace IRI
dcterms   http://purl.org/dc/terms/
marcrel   http://id.loc.gov/vocabulary/relators/
vivo      http://vivoweb.org/ontology/core#
n21       http://hub.abes.fr/edp/periodical/articletype/
n9        http://hub.abes.fr/namespace/person/mail/31c1fbc26864f2647bdb47552dead9e3/
n15       http://hub.abes.fr/edp/periodical/aa/
n8        http://www.idref.fr/165084871/
n4        http://hub.abes.fr/edp/periodical/aa/2021/volume_652/issue_2021/aa38911-20/authorship/
n6        http://hub.abes.fr/edp/periodical/aa/2021/volume_652/issue_2021/aa38911-20/subject/
n2        http://hub.abes.fr/edp/periodical/aa/2021/volume_652/issue_2021/aa38911-20/
bibo      http://purl.org/ontology/bibo/
n17       http://orcid.org/0000-0001-9226-8992#
rdac      http://rdaregistry.info/Elements/c/
hub       http://hub.abes.fr/namespace/
rdf       http://www.w3.org/1999/02/22-rdf-syntax-ns#
n20       http://hub.abes.fr/edp/periodical/aa/2021/volume_652/issue_2021/
n19       http://hub.abes.fr/namespace/person/mail/31b0e657ea7a4f8f438f40e1ecc2efb7/
n11       http://hub.abes.fr/edp/periodical/aa/2021/volume_652/issue_2021/aa38911-20/m/
n12       http://hub.abes.fr/referentiel/edparticlecategories/subject/
rdaw      http://rdaregistry.info/Elements/w/
xsdh      http://www.w3.org/2001/XMLSchema#
n18       http://orcid.org/0000-0002-1058-9109#
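
The prefixed names used in the statements below (for example n2:w or bibo:Article) abbreviate full IRIs: the part before the colon is looked up in the prefix table above and the local part is appended to the namespace IRI. A minimal sketch of that expansion follows, assuming a hand-copied subset of the table; the names PREFIXES and expand are hypothetical and not part of any RDF library.

```python
# Hypothetical helper: expand a prefixed name such as "n2:w" into a full IRI.
# PREFIXES is a hand-copied subset of the prefix table, for illustration only.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "n2": "http://hub.abes.fr/edp/periodical/aa/2021/volume_652/issue_2021/aa38911-20/",
    "n20": "http://hub.abes.fr/edp/periodical/aa/2021/volume_652/issue_2021/",
}


def expand(name, prefixes=PREFIXES):
    """Return the full IRI for a prefixed name, e.g. 'bibo:Article'."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown or missing prefix in {name!r}")
    # The full IRI is the namespace IRI with the local part appended.
    return prefixes[prefix] + local


if __name__ == "__main__":
    print(expand("n2:w"))
    print(expand("bibo:Article"))
```

Under these assumptions, the subject n2:w resolves to the article IRI ending in aa38911-20/w, and n20:w to the issue it is part of.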
Subject Item
n2:w
rdf:type
bibo:Article rdac:C10001
dcterms:isPartOf
n20:w
dcterms:subject
n6:galaxiesactive n12:extragalacticastronomy n6:methodsstatistical n6:galaxiesseyfert n6:quasarsgeneral
dcterms:title
Interpreting automatic AGN classifiers with saliency maps
dcterms:date
2021-01-01
rdaw:P10072
n11:print n11:web
vivo:relatedBy
n4:1 n4:5 n4:4 n4:2 n4:3 n4:6
marcrel:aut
n8:id n2:cirois n9:129f751e1bb38e61fc7335b3be24bef0 n2:nardinie n2:pasquatom n2:peruzzit n2:bertonm n17:person n18:person n2:marzianip n19:af5ae8d00e171693bf4a69a45d68e32b
dcterms:abstract
Classification of the optical spectra of active galactic nuclei (AGN) into different types is currently based on features such as line widths and intensity ratios. Although well founded on AGN physics, this approach involves some degree of human oversight and cannot scale to large datasets. Machine learning (ML) tackles this classification problem in a fast and reproducible way, but is often (and not without reason) perceived as a black box. However, ML interpretability and explainability are active research areas in computer science that are providing us with tools to mitigate this issue. We apply ML interpretability tools to a classifier trained to predict AGN types from spectra. Our goal is to demonstrate the use of such tools in this context, obtaining for the first time insight into an otherwise black box AGN classifier. In particular, we want to understand which parts of each spectrum most affect the predictions of our classifier, checking that the results make sense in the light of our theoretical expectations. We trained a support-vector machine on 3346 high-quality, low-redshift AGN spectra from SDSS DR15. We considered either two-class classification (type 1 versus 2) or multiclass (type 1 versus 2 versus intermediate-type). The spectra were previously and independently hand-labeled and divided into types 1 and 2, and intermediate-type (i.e., sources in which the Balmer line profile consists of a sharp narrow component superimposed on a broad component). We performed a train-validation-test split, tuning hyperparameters and independently measuring performance via a variety of metrics. On a selection of test-set spectra, we computed the gradient of the predicted class probability at a given spectrum. Regions of the spectrum were then color-coded based on the direction and the amount by which they influence the predicted class, effectively building a saliency map.
We also visualized the high-dimensional space of AGN spectra using t-distributed stochastic neighbor embedding (t-SNE), showing where the spectra for which we computed a saliency map are located. Our best classifier reaches an F-score of 0.942 on our test set (with 0.948 precision and 0.936 recall). We computed saliency maps on all misclassified spectra in the test set and on a sample of randomly selected spectra. Regions that affect the predicted AGN type often coincide with physically relevant features, such as spectral lines. t-SNE visualization shows good separability of type 1 and type 2 spectra. Intermediate-type spectra either lie in-between, as expected, or appear mixed with type 2 spectra. Misclassified spectra are typically found among the latter. Some clustering structure is apparent among type 2 and intermediate-type spectra, though this may be an artifact. Saliency maps show why a given AGN type was predicted by our classifier resulting in a physical interpretation in terms of regions of the spectrum that affected its decision, making it no longer a black box. These regions coincide with those used by human experts, for example relevant spectral lines, and are even used in a similar way; the classifier effectively measures the width of a line by weighing its center and its tails oppositely.
hub:articleType
n21:researcharticle
hub:publisher-id
aa38911-20
dcterms:dateCopyrighted
2021-01-01
dcterms:rights
© ESO 2021
dcterms:rightsHolder
ESO
hub:isPartOfThisJournal
n15:w
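
The abstract recorded above describes a saliency map as the gradient of the predicted class probability with respect to the input spectrum. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes a toy three-pixel "spectrum" and a made-up logistic model (toy_prob, WEIGHTS) standing in for the trained support-vector machine, and it approximates the gradient by central finite differences.

```python
import math


def saliency(prob, spectrum, eps=1e-5):
    """Approximate d prob / d spectrum[i] by central finite differences."""
    grad = []
    for i in range(len(spectrum)):
        up = list(spectrum)
        down = list(spectrum)
        up[i] += eps
        down[i] -= eps
        grad.append((prob(up) - prob(down)) / (2 * eps))
    return grad


# Hypothetical stand-in for a trained classifier: a logistic model whose
# weights are invented for illustration (positive at the line center,
# negative in the tails). These values do not come from the paper.
WEIGHTS = [-1.0, 2.0, -1.0]


def toy_prob(spectrum):
    """Toy predicted probability of one class for a 3-pixel spectrum."""
    z = sum(w * x for w, x in zip(WEIGHTS, spectrum))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    flux = [0.2, 1.0, 0.2]  # toy flux values: tail, center, tail
    for pixel, g in enumerate(saliency(toy_prob, flux)):
        direction = "raises" if g > 0 else "lowers"
        print(f"pixel {pixel}: gradient {g:+.4f} ({direction} the probability)")
```

The sign of each gradient component gives the direction in which that pixel pushes the predicted probability and its magnitude gives the strength, which is the information the abstract says was color-coded onto the spectrum.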